diff --git a/modules/webhelp/web/hacking/Makefile.am b/modules/webhelp/web/hacking/Makefile.am index 78a211a91..ca67f8867 100644 --- a/modules/webhelp/web/hacking/Makefile.am +++ b/modules/webhelp/web/hacking/Makefile.am @@ -1,46 +1,38 @@ ## $Id$ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. 
docdir = $(localstatedir)/www/hacking -imgdir = $(localstatedir)/www/img/hacking - -doc_DATA=index.html modules.html style.html concepts.html directory.html cdsware.el \ - releases.html testsuite.html +doc_DATA = cdsware.el -img_DATA=modules-overview-graph.jpeg +imgdir = $(localstatedir)/www/img/hacking +img_DATA = modules-overview-diagram.jpeg webdoclibdir = $(libdir)/webdoc/hacking - webdoclib_DATA = \ - internals.webdoc \ + hacking.webdoc \ modules-overview.webdoc \ coding-style.webdoc \ common-concepts.webdoc \ directory-organization.webdoc \ - release-numbering-scheme.webdoc \ - test-suite-strategy.webdoc + release-numbering.webdoc \ + test-suite.webdoc -FILESWML = $(wildcard $(srcdir)/*.wml) -EXTRA_DIST = $(webdoclib_DATA) modules-overview-graph.jpeg cdsware.el +EXTRA_DIST = $(webdoclib_DATA) $(doc_DATA) $(img_DATA) CLEANFILES = *~ *.tmp - -%.html: %.html.wml $(top_srcdir)/config/config.wml $(top_builddir)/config/configbis.wml - $(WML) -o\(ALL-LANG_*\)+LANG_EN:$@ $< - $(PYTHON) $(top_srcdir)/po/i18n_update_wml_target.py en $@ \ No newline at end of file diff --git a/modules/webhelp/web/hacking/concepts.html.wml b/modules/webhelp/web/hacking/concepts.html.wml deleted file mode 100644 index a7238afc8..000000000 --- a/modules/webhelp/web/hacking/concepts.html.wml +++ /dev/null @@ -1,120 +0,0 @@ -## $Id$ - -## This file is part of CDS Invenio. -## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. -## -## CDS Invenio is free software; you can redistribute it and/or -## modify it under the terms of the GNU General Public License as -## published by the Free Software Foundation; either version 2 of the -## License, or (at your option) any later version. -## -## CDS Invenio is distributed in the hope that it will be useful, but -## WITHOUT ANY WARRANTY; without even the implied warranty of -## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -## General Public License for more details. 
-## -## You should have received a copy of the GNU General Public License -## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., -## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. - -#include "cdspage.wml" \ - title="Common Concepts" \ - navbar_name="hacking-common-concepts" \ - navtrail_previous_links="/hacking/>Hacking CDS Invenio " \ - navbar_select="hacking-common-concepts" - -
Version <: print generate_pretty_revision_date_string('$Id$'); :> - -
-The description of concepts you will encounter here and there in the -CDS Invenio. Our interpretation may differ from the practice found in -other products, so please read this carefully. - -1. sysno - (ALEPH|old) system number - - Stands for (ALEPH|old) system number only. Which means that, for - outside-CERN CDS Invenio installations, stands for an 'old system - number' whatever it is, if they want to publicise it instead of our - internal auto-incremented CDS Invenio record identifiers. - -2. recID - CDS Invenio record identifier - - Each record has got an auto-incremented ID in the "bibrec" table - (formerly called "bibitem"). This is the basic "record identifier" - concept in CDS Invenio. - -3. docID - eventual fulltext document identifier - - Each fulltext file may have eventual docID. This will permit us to - interconnect records (recID) with fulltext files (docID), if we - want to. At the moment there is only one-way connection from recID - to docID via HTTP field 856. This is ugly. I think we may - probably profit by introducing recID-docID relationship in several - ways: file protection, reference extraction, fulltext - indexing... (?!) - -4. field - logical field concept such as "reportnumber" - - A bibliographic record is composed of 'fields' such as title or - author. Note that we consider 'field' to be a logical concept, - that is compound and may consist of several physical MARC fields. - For example, "report number" field consists of several MARC fields - such as 088 $a, 037 $a, 909C0 $r. Another example: "first report - number" consist of only one MARC field, 037 $a. - -5. tag - physical field concept such as "088 $a". - - Having defined the concept of 'logical field', let's now turn to - the 'physical field' that denotes basically the concept of 'MARC - field' as defined in MARC-21 standard. In addition to tag, a field - may contain two identifiers to describe the data content, and - subfield codes to denote various parts of the content. 
See our - HOWTO MARC guide on this. - - Thus said, in the implementation of our bibliographic tables - (bibXXx) we have sort of generalized the term 'tag' to stand for: - - tag = tag code + identifier1 + identifier1 + subfield code - - This convention, while taking some freedom from the MARC-21 - standard, enables us to write things like "field: base number, tag: - 909C0b, value: 11". If this interpretation is indeed too free with - respect to the standard usage of terms, we may change them in the - future. - -6. collection - here we distinguish (i) primary collection concept - and (ii) specific collection concept. - - The (i) primary collections are basic organizational structure of - how the records are grouped together in collections. The primary - collections are used in the navigable search interface under the - ``Narrow search'' box. The (ii) specific collections present an - orthogonal view on the data organization, that is useful to group - together some records from different primary collections, if they - present a common pattern. The specific collections are used in the - search interface under the ``Focus on'' box. - - The primary collections are defined mainly by the collection - identifier ("980 $a,b"); and the specific collections are as - defined by any query that is possible for a search engine to - execute (see also "dbquery" column in the "collection" table). - - In the past we used to use the term "catalogue", that is now - deprecated, and that can be interchanged with "collection". - -7. doctype - stands for web document type concept, used in WebSubmit - - The "document type" is used solely for submission purposes, and - fulltext access purposes ("setlink"-like). For example, a document - type "photo" may be used in many collections such as "Foo Photos", - "Bar PhotoLab", etc. Similarly, one collection can cover several - doctypes. (M:N relationship) - -8. 
baskets, alerts, settings - covering personal features - - Denote personal features, for which we previously used the terms - "shelf" and "profile" that are now deprecated. - -- end of file - - -diff --git a/modules/webhelp/web/hacking/directory.html.wml b/modules/webhelp/web/hacking/directory.html.wml deleted file mode 100644 index a343c4749..000000000 --- a/modules/webhelp/web/hacking/directory.html.wml +++ /dev/null @@ -1,203 +0,0 @@ -## $Id$ - -## This file is part of CDS Invenio. -## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. -## -## CDS Invenio is free software; you can redistribute it and/or -## modify it under the terms of the GNU General Public License as -## published by the Free Software Foundation; either version 2 of the -## License, or (at your option) any later version. -## -## CDS Invenio is distributed in the hope that it will be useful, but -## WITHOUT ANY WARRANTY; without even the implied warranty of -## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -## General Public License for more details. -## -## You should have received a copy of the GNU General Public License -## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., -## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. - -#include "cdspage.wml" \ - title="Directory Organization" \ - navbar_name="hacking-directory-organization" \ - navtrail_previous_links="/hacking/>Hacking CDS Invenio " \ - navbar_select="hacking-directory-organization" - -
Version <: print generate_pretty_revision_date_string('$Id$'); :> - -
-Please find some notes below on how the source (as well as the target) -directory structure is organized, where the sources get installed to, -and how the visible URLs are organized. - -1. CDS Invenio will generally install into the directory taken from - --with-prefix configuration variables. These are discussed in points - 2 and 3 below, respectively. - -2. The first directory (--with-prefix) specifies general CDS Invenio - install directory, where we'll put CLI binaries, Python and PHP - libraries, manpages, log and cache directories for the running - installation, and any other dirs as needed. They will all live - under one common hood. - - For example, configure --with-prefix=/opt/cds-invenio, - and you'll obtain the following principal directories: - - /opt/cds-invenio/ - /opt/cds-invenio/bin - /opt/cds-invenio/lib - /opt/cds-invenio/lib/php - /opt/cds-invenio/lib/python - /opt/cds-invenio/lib/wml - /opt/cds-invenio/var - /opt/cds-invenio/var/cache - /opt/cds-invenio/var/log - - with the obvious meaning: - - - bin : for command-line executable binaries and scripts - - - lib/php : for our own PHP libraries, see below - - - lib/python : for our own Python libraries, see below - - - lib/wml : for our own WML libraries, see below - - - var : for installation-specific runtime stuff - - - var/log : for all sorts of runtime logging, e.g. search.log - - - var/cache : for all sorts of runtime caching, e.g. OAI - retention harvesting, collection cache, etc - - This scheme copies to some extent the usual Unix filesystem - convention, so it may be easily expanded later according to our - future needs. - -3. The second directory (prefix/var/www) contains Web scripts (PHP, - mod_python), HTML documents and images, and so on. This is where - webuser-seen files are located. Basically, the files there contain - only the interface to the functionality that is provided by the - libraries stored under the library directory. 
- - The prefix/var/www directory is further structured according to - whom it provides services. We distinguish user-level, admin-level - and hacker-level access to the site, as reflected by the visible - URL structure. - - a) The user-level access point is provided by the main WEBURL - address and its subdirs. All the user-level documentation is - available under WEBURL/help/. The module-specific user-level - documentation is available under WEBURL/help/<module>/. - - b) The admin-level access is provided by WEBURL/admin/ entry - point. The admin-level documentation is accessible from the - same place. The admin-level module-specific functionality and - help is available under WEBURL/admin/<module>/. (If - it's written in mod_python, it usually points to - WEBURL/<module>admin.py/ since we configure the server - to have all mod_python scripts under the prefix/var/www root - directory.) - - c) The hacker-level documentation is provided by WEBURL/hacking/ - entry point. There is no hacker-level functionality possible - via Web, of course, so that unlike admin-level entry point, - the hacker-level entry point provides only a common access to - available hacking documention, etc. The module-specific - information is available under WEBURL/hacking/<module>/. - -4. Let's now return a bit more closely to the role Python and PHP - library directories outside of the Apache tree: - - /opt/cds-invenio/lib/php - /opt/cds-invenio/lib/python - - Here we put not only (a) libraries that may be reused across CDS - Invenio modules, but also (b) all the "core" functionality of CDS - Invenio that is not directly callable by the end users. The - "callable" functionality is put under "prefix/var/www" in case of - web scripts and documents, and under "bindir" in case of CLI - executables. 
- - As for (a), for example in the PHP CDS Invenio library you'll find - currently the common PHP error handling code that is shared between - BibFormat and WebSubmit; in the Python CDS Invenio library (in fact, - CDS Invenio Pythonic 'module', but we are reserving the word 'module' - to denote 'CDS Invenio module' in this text) you'll find config.py - containing WML-supplied site parameters, dbquery.py containing DB - persistent query module, or webpage.py with templates and functions - to produce mod_python web pages with common look and feel. These - could and should be reused across all our modules. Note that I - created only a small number of "broad" libraries at the moment. In - case we want to reuse more code parts, we'd refactor the code more, - as needed. - - As for (b), for example the existing search engine was split into - search.py that only contains three "callable" functions, which goes - into prefix/var/www, while the search engine itself is composed of - search_engine.py and search_engine_config.py living under LIBDIR. - In this way we can easily create "real" CLI search, that will - depend only on the search libraries in LIBDIR, and that will get - installed into BINDIR. - - To recap: - - - For each CDS Invenio module, I'm differentiating between - "callable" and "core" parts. The former go into - prefix/var/www or BINDIR, the latter into LIBDIR. - - - Our PHP/Pythonic libraries contain several sorts of thing: - - - the implementation of the "callable" functions - - - non-callable internal "core" or "library" code parts, as - stated above. Not shared across CDS Invenio modules. - - - utility code meant for reuse across CDS Invenio modules, such - as dbquery.py - - - Pythonic config files out of user-supplied WML (non-MySQL) - configuration parameters (see - e.g. search_engine_config.py) - -5. The same strategy is reflected in the organization of source - directories inside CDS Invenio CVS. 
Each CDS Invenio module lives in a - separate directory located under "modules" directory of the - sources. Further on, each module contains usually several - subdirectories that reflect the above-mentioned packaging choice. - For example, in case of WebSearch you'll find: - - ./modules/websearch - ./modules/websearch/bin - ./modules/websearch/doc - ./modules/websearch/doc/hacking - ./modules/websearch/doc/admin - ./modules/websearch/lib - ./modules/websearch/web - ./modules/websearch/web/admin - - with the following straightforward meaning: - - - bin : for callable CLI binaries and scripts - - - doc : for documentation. The user-level documentation is - located in this directory. The admin-level - documentation is located in the "admin" subdir. The - programmer-level documentation is located in the - "hacking" subdir. - - - lib : for uncallable "core" functionality, see the comments - above - - - web : for callable web scripts and pages. The user- and - admin- level is separated similarly as in the "doc" - directory (see above). - - The structure is respected throughout all the CDS Invenio modules, a - notable exception being the MiscUtil module that contains subdirs - like "sql" (for the table creating/dropping SQL commands, etc) or - "demo" (for creation of Atlantis Institute of Science, our demo - site.) - -- end of file - -diff --git a/modules/webhelp/web/hacking/internals.webdoc b/modules/webhelp/web/hacking/hacking.webdoc similarity index 100% rename from modules/webhelp/web/hacking/internals.webdoc rename to modules/webhelp/web/hacking/hacking.webdoc diff --git a/modules/webhelp/web/hacking/index.html.wml b/modules/webhelp/web/hacking/index.html.wml deleted file mode 100644 index 058a15961..000000000 --- a/modules/webhelp/web/hacking/index.html.wml +++ /dev/null @@ -1,94 +0,0 @@ -## $Id$ - -## This file is part of CDS Invenio. -## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. 
-## -## CDS Invenio is free software; you can redistribute it and/or -## modify it under the terms of the GNU General Public License as -## published by the Free Software Foundation; either version 2 of the -## License, or (at your option) any later version. -## -## CDS Invenio is distributed in the hope that it will be useful, but -## WITHOUT ANY WARRANTY; without even the implied warranty of -## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -## General Public License for more details. -## -## You should have received a copy of the GNU General Public License -## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., -## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. - -#include "cdspage.wml" \ - title="Hacking CDS Invenio" \ - navbar_name="hacking" \ - navtrail_previous_links="" \ - navbar_select="hacking" - -Welcome to the CDS Invenio Developers' corner. Before diving into the -source, make sure you don't miss our /help/">user-level and /admin/">admin-level documentation as well. And now, back to the source, and happy hacking! - -
-- -- -
-- Common Concepts
-- Summarizing common terms you will encounter here and there.
- -- Coding Style
-- A policy we try to follow, for good or bad.
- -- Release Numbering Scheme
-- Presenting the version numbering scheme adopted for CDS Invenio stable and development releases.
- -- Directory Organization
-- How the source and target directories are organized, where the -sources get installed to, what is the visible URL policy, etc.
- -- Modules Overview
-- Presenting a summary of various CDS Invenio modules and their relationships.
- -- Test Suite Strategy
-- Describes our test suite strategy.
- -
-diff --git a/modules/webhelp/web/hacking/modules.dia b/modules/webhelp/web/hacking/modules-overview-diagram.dia similarity index 100% rename from modules/webhelp/web/hacking/modules.dia rename to modules/webhelp/web/hacking/modules-overview-diagram.dia diff --git a/modules/webhelp/web/hacking/modules-overview-graph.jpeg b/modules/webhelp/web/hacking/modules-overview-diagram.jpeg similarity index 100% rename from modules/webhelp/web/hacking/modules-overview-graph.jpeg rename to modules/webhelp/web/hacking/modules-overview-diagram.jpeg diff --git a/modules/webhelp/web/hacking/modules-overview.webdoc b/modules/webhelp/web/hacking/modules-overview.webdoc index b733cb9a1..7409a831b 100644 --- a/modules/webhelp/web/hacking/modules-overview.webdoc +++ b/modules/webhelp/web/hacking/modules-overview.webdoc @@ -1,274 +1,274 @@ ## $Id$ ## This file is part of CDS Invenio. ## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. ## ## CDS Invenio is free software; you can redistribute it and/or ## modify it under the terms of the GNU General Public License as ## published by the Free Software Foundation; either version 2 of the ## License, or (at your option) any later version. ## ## CDS Invenio is distributed in the hope that it will be useful, but ## WITHOUT ANY WARRANTY; without even the implied warranty of ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU ## General Public License for more details. ## ## You should have received a copy of the GNU General Public License ## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., ## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.- -
-- BibClassify Internals
-- Describes information useful to understand how BibClassify works, -the taxonomy extensions we use, how the keyword extraction algorithm works. -
- -- BibConvert Internals
-- Describes information useful to understand how BibConvert works, - and the BibConvert functions can be reused.
- -- BibFormat Internals
-- Describes information useful to understand how BibFormat works.
- -- BibRank Internals
-- Describes information useful to understand how the various -ranking methods available in bibrank works, and how they can -be tweaked to give various output.
- -- MiscUtil Internals
-- Describes information useful to understand what can be found inside the miscellaneous utilities -module, like database access, error management, date handling library, etc.
- -- WebSearch Internals
-- Describes information useful to understand the search process -internals, like the different search stages, the high- and low-level -API, etc.
- -- WebAccess Internals
-- Describes information useful to understand the access control process -internals, its API, etc.
- -
CDS Invenio consists of several more or less independent modules with precisely defined functionality. The general criterion for module names is to use the ``Bib'' prefix to denote modules that work more with the bibliographic data, and the ``Web'' prefix to denote modules that work more with the Web interface. (The difference is of course blurred in some cases, as in the case of search engine that has got a web interface but searches bibliographic data.)
Follows a brief description of what each module does. After descriptions the module relationship diagram is presented.
Relationship between the modules:
-
+
Version <: print generate_pretty_revision_date_string('$Id$'); :> - -
CDS Invenio consists of several more or less independent modules with -precisely defined functionality. The general criterion for module -names is to use the ``Bib'' prefix to denote modules that work more -with the bibliographic data, and the ``Web'' prefix to denote modules -that work more with the Web interface. (The difference is of course -blurred in some cases, as in the case of search engine that has got a -web interface but searches bibliographic data.) - -
Follows a brief description of what each module does. After -descriptions the module relationship diagram is presented. - -
Relationship between the modules: - -
- - diff --git a/modules/webhelp/web/hacking/release-numbering-scheme.webdoc b/modules/webhelp/web/hacking/release-numbering.webdoc similarity index 100% rename from modules/webhelp/web/hacking/release-numbering-scheme.webdoc rename to modules/webhelp/web/hacking/release-numbering.webdoc diff --git a/modules/webhelp/web/hacking/releases.html.wml b/modules/webhelp/web/hacking/releases.html.wml deleted file mode 100644 index f0a964e0d..000000000 --- a/modules/webhelp/web/hacking/releases.html.wml +++ /dev/null @@ -1,104 +0,0 @@ -## $Id$ - -## This file is part of CDS Invenio. -## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. -## -## CDS Invenio is free software; you can redistribute it and/or -## modify it under the terms of the GNU General Public License as -## published by the Free Software Foundation; either version 2 of the -## License, or (at your option) any later version. -## -## CDS Invenio is distributed in the hope that it will be useful, but -## WITHOUT ANY WARRANTY; without even the implied warranty of -## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -## General Public License for more details. -## -## You should have received a copy of the GNU General Public License -## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., -## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. - -#include "cdspage.wml" \ - title="Release Numbering Scheme" \ - navbar_name="hacking-release-version-numbering-scheme" \ - navtrail_previous_links="/hacking/>Hacking CDS Invenio " \ - navbar_select="hacking-release-version-numbering-scheme" - -
Version <: print generate_pretty_revision_date_string('$Id$'); :> - -
-CDS Invenio uses the classical major.minor.patchlevel release version -numbering scheme that is commonly used in the GNU/Linux world and -elsewhere. Each release is labelled by - - major.minor.patchlevel - -release version number. For example, a release version 4.0.1 means: - - 4 - 4th major version, i.e. the whole system has been already - 4th times either fully rewritten or at least in its very - essential components. The upgrade from one major version - to another may be rather hard, may require new prerequisite - technologies, full data dump, reload and reindexing, as - well as other major configuration adapatations, possibly - with an important manual intervention. - - 0 - 0th minor version, i.e. the first minor release of the 4th - major rewrite. (Increments go 4.1, 4.2, ... 4.9, 4.10, - 4.11, 4.12, ... until some important rewrite is done, - e.g. the database philosophy dramatically changes, leading - to a non-trivial upgrade, and we have 5.0.) The upgrade - from one minor version to another may be laborious but is - relatively painless, in that some table changes and data - manipulations may be necessary but they are somewhat - smaller in nature, easier to grasp, and possibly done by an - automated script. - - 1 - 1st patch level to 4.0, fixing bugs in 4.0.0 but not adding - any substantially new functionality. That is, the only new - functionality that is added is that of a `bug fix' nature. - The upgrade from one patch level to another is usually - straightforward. - - (Packages often seem to break this last rule, e.g. Linux - kernel adopting new important functionality (such as - ReiserFS) within the stable 2.4.x branch. It can be easily - seen that it is somewhat subjective to judge what is - qualitatively more like a minor new functionality and what - is more like a patch to the existing behaviour. 
We have - tried to quantify these notions with respect to whether - table structure and/or technology change require small or - large upgrade jobs and eventual manual efforts.) - -So, if we have a version 4.3, a bug fix would mean to release 4.3.1, -some minor new functionality and upgrade would mean to release 4.4, -some important database structure rewrite or an imaginary exchange of -Python for Common Lisp would mean to release 5.0, etc. - -In addition, the two-branch release policy is adopted: - - a) stable branch - releases in the stable branch are numbered with - even minor version number, like 0.2, 0.4, etc. These releases - are usually well tested. The configuration files and features - usually don't change often from one release to another. The - release frequency is low. - - b) development branch - releases in the development branch are - number with the odd minor version number, like 0.1, 0.3, etc. - These releases are more experimental and may be less tested than - the stable ones. The configuration files and features change - more rapidly from one release to another. The release frequency - is higher. - -It can be seen that the above scheme is somewhat similar to the Linux -kernel version numbering scheme. - -Currently, CDS Invenio 0.0.9 represents the stable branch release and -0.1.0 the development branch release. We are going to frequently -update it to provide 0.1.1, 0.1.2, etc as the currently missing admin -functionality is being added into the development branch, until later -on, when some release, say 0.1.8, will achieve a status of -satisfaction, at which point we release it as the next stable version -(0.2 or 1.0), and start a new development branch (0.3 or 1.1). 
- -- end of file - -diff --git a/modules/webhelp/web/hacking/style.html.wml b/modules/webhelp/web/hacking/style.html.wml deleted file mode 100644 index 26e2957c6..000000000 --- a/modules/webhelp/web/hacking/style.html.wml +++ /dev/null @@ -1,216 +0,0 @@ -## -*- mode: html; coding: utf-8; -*- -## $Id$ - -## This file is part of CDS Invenio. -## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. -## -## CDS Invenio is free software; you can redistribute it and/or -## modify it under the terms of the GNU General Public License as -## published by the Free Software Foundation; either version 2 of the -## License, or (at your option) any later version. -## -## CDS Invenio is distributed in the hope that it will be useful, but -## WITHOUT ANY WARRANTY; without even the implied warranty of -## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -## General Public License for more details. -## -## You should have received a copy of the GNU General Public License -## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., -## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. - -#include "cdspage.wml" \ - title="Coding Style" \ - navbar_name="hacking-coding-style" \ - navtrail_previous_links="/hacking/>Hacking CDS Invenio " \ - navbar_select="hacking-coding-style" - -
Version <: print generate_pretty_revision_date_string('$Id$'); :> - -
-A brief description of things we strive at, more or less unsuccessfully. - -1. Packaging - - We use the classical GNU Autoconf/Automake approach, for tutorial - see e.g. Learning the GNU development tools or the AutoBook. - -2. Modules - - CDS Invenio started as a set of pretty independent modules developed by - independent people with independent styles. This was even more - pronounced by the original use of many different languages - (e.g. Python, PHP, Perl). Now the CDS Invenio code base is striving to - use Python everywhere, except in speed-critical parts when a - compiled language such as Common Lisp may come to the rescue in the - near future. - - When modifying an existing module, we propose to strictly continue - using whatever coding style the module was originally written into. - When writing new modules, we propose to stick to the - below-mentioned standards. - - The code integration across modules is happening, but is slow. - Therefore, don't be surprised to see that there is a lot of room to - refactor. - -3. WML/ePerl/etc - - This is not so important, because not many lines-of-code were - written in WML/ePerl. We prefer to loosely follow the GNU way, as - always. - -4. Python - - We aim at following recommendations from PEP 8, although - the existing code surely do not fulfil them here and there. - The code indentation is done via spaces only, please do not use - tabs. One tab counts as four spaces. Emacs users can look into - our cdsware.el for inspiration. - - All the Python code should be extensively documented via - docstrings, so you can always run pydoc file.py to peruse the - file's documentation in one simple go. - - Do not forget to run pylint on your code to check for errors like - uninitialized variables and to improve its quality and conformance - to the coding standard. If you develop in Emacs, run M-x pylint - RET on your buffers frequently. Read and implement pylint - suggestions. 
(Note that using lambda and friends may lead to false - pylint warnings. You can switch them off by putting block comments - of the form ``# pylint: disable-msg=C0301''.) - - Do not forget to run pychecker on your code either. It is another - source code checker that catches some situations better and some - situations worse than pylint. If you develop in Emacs, run C-c C-w - (M-x py-pychecker-run RET) on your buffers frequently. (Note that - using psyco on classes may lead to false pychecker warnings.) - - You can check the kwalitee of your code by running ``python - modules/miscutil/lib/kwalitee.py *.py'' on your files. You can - also check the code kwalitee across all the modules by running - ``make kwalitee-check'' in the main source directory. - - Do not hardcode magic constants in your code. Every magic string or - a number should be put into accompanying file_config.py with - symbol name beginning by cfg_modulename_*. - - Clearly separate interfaces from implementation. Document your - interfaces. Do not expose to other modules anything that does not - have to be exposed. Apply principle of least information. - - Create as few new library files as possible. Do not create many - nested files in nested modules; rather put all the lib files in one - dir with bibindex_foo and bibindex_bar names. - - Use imperative/functional paradigm rather then OO. If you do use - OO, then stick to as simple class hierarchy as possible. Recall - that method calls and exception handling in Python are quite - expensive. - - Use rather the good old foo_bar naming convention for symbols (both - variables and function names) instead of fooBar CaMelCaSe - convention. (Except for Class names where UppercaseSymbolNames are - to be used.) - - Pay special attention to name your symbols descriptively. Your - code is going to be read and work with by others and its symbols - should be self-understandable without any comments and without - studying other parts of the code. 
For example, use proper English - words, not abbreviations that can be misspelled in many ways; use - words that go in pairs (e.g. create/destroy, start/stop; never - create/stop); use self-understandable symbol names - (e.g. list_of_file_extensions rather than list2); never misname - symbols (e.g. score_list should hold the list of scores and nothing - else - if in the course of development you change the semantics of - what the symbol holds then change the symbol name too). Do not be - afraid to use long descriptive names; good editors such as Emacs - can tab-complete symbols for you. - - When hacking module A, pay close attention to following the existing - coding conventions in A, even if they are legacy-weird and even if we - use a different technique elsewhere. (Unless the whole module A is - going to be refactored, of course.) - - Speed-critical parts should be profiled with pyprof. Do not forget - to use tricks like psyco. - - The code should be well tested before being committed. Testing is an - integral part of the development process. Test as you - program. The testing process should be automated via our unit - test and regression test suite infrastructures. Please read the - test suite strategy to know more. - - Python promotes writing clear, readable, easily maintainable code. - Write it as such. Recall Albert Einstein's ``Everything should be - made as simple as possible, but not simpler''. Things should be - neither overengineered nor oversimplified. - - Recall the principles Unix is built upon, as summarized in Eric - S. Raymond's TAOUP: - - Rule of Modularity: Write simple parts connected by clean interfaces. - Rule of Clarity: Clarity is better than cleverness. - Rule of Composition: Design programs to be connected with other programs. - Rule of Separation: Separate policy from mechanism; separate interfaces from engines. - Rule of Simplicity: Design for simplicity; add complexity only where you must.
- Rule of Parsimony: Write a big program only when it is clear by demonstration that nothing else will do. - Rule of Transparency: Design for visibility to make inspection and debugging easier. - Rule of Robustness: Robustness is the child of transparency and simplicity. - Rule of Representation: Fold knowledge into data, so program logic can be stupid and robust. - Rule of Least Surprise: In interface design, always do the least surprising thing. - Rule of Silence: When a program has nothing surprising to say, it should say nothing. - Rule of Repair: Repair what you can -- but when you must fail, fail noisily and as soon as possible. - Rule of Economy: Programmer time is expensive; conserve it in preference to machine time. - Rule of Generation: Avoid hand-hacking; write programs to write programs when you can. - Rule of Optimization: Prototype before polishing. Get it working before you optimize it. - Rule of Diversity: Distrust all claims for one true way. - Rule of Extensibility: Design for the future, because it will be here sooner than you think. - - or the golden rule that says it all: ``keep it simple''. - - For more hints, thoughts, and other ruminations on programming, - see my Vademecum and Essays. - -5. PHP - - We are slowly moving away from PHP, so there may be several - practices in place in the PHP code present in CDS Invenio. Usually - this is consistent within modules but inconsistent across modules. - For example, some old code used Emacs' perl-mode, following - traditional K&R C style, while some other old code tried to stick - to PEAR recommendations. - -6.
MySQL - - Table naming policy is, roughly and briefly: - - - "foo": table names in lowercase, without prefix, used by me - for WebSearch - - - "foo_bar": underscores represent M:N relationship between - "foo" and "bar", to tie the two tables together - - - "bib*": many tables to hold the metadata and relationships - between them - - - "idx*": idx is the table name prefix used by BibIndex - - - "rnk*": rnk is the table name prefix used by BibRank - - - "flx*": flx is the table name prefix used by FlexElink (also known as - BibFormat) - - - "sbm*": sbm is the table name prefix used by WebSubmit - - - "sch*": sch is the table name prefix used by BibSched - - - "collection*": many tables to describe collections and search - interface pages - - - "user*" : many tables to describe personal features (baskets, - alerts) - -- end of file - - -diff --git a/modules/webhelp/web/hacking/test-suite-strategy.webdoc b/modules/webhelp/web/hacking/test-suite.webdoc similarity index 100% rename from modules/webhelp/web/hacking/test-suite-strategy.webdoc rename to modules/webhelp/web/hacking/test-suite.webdoc diff --git a/modules/webhelp/web/hacking/testsuite.html.wml b/modules/webhelp/web/hacking/testsuite.html.wml deleted file mode 100644 index 7ed647d82..000000000 --- a/modules/webhelp/web/hacking/testsuite.html.wml +++ /dev/null @@ -1,494 +0,0 @@ -## $Id$ - -## This file is part of CDS Invenio. -## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007 CERN. -## -## CDS Invenio is free software; you can redistribute it and/or -## modify it under the terms of the GNU General Public License as -## published by the Free Software Foundation; either version 2 of the -## License, or (at your option) any later version. -## -## CDS Invenio is distributed in the hope that it will be useful, but -## WITHOUT ANY WARRANTY; without even the implied warranty of -## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU -## General Public License for more details. 
-## -## You should have received a copy of the GNU General Public License -## along with CDS Invenio; if not, write to the Free Software Foundation, Inc., -## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA. - -#include "cdspage.wml" \ - title="Test Suite Strategy" \ - navbar_name="hacking-test-suite-strategy" \ - navtrail_previous_links="/hacking/>Hacking CDS Invenio " \ - navbar_select="hacking-test-suite-strategy" - -
Version <: print generate_pretty_revision_date_string('$Id$'); :> - -
This document presents guidelines for unit testing and regression -testing homogenisation throughout all CDS Invenio modules. - -
Testing is an important coding activity. Most authors believe that -writing test cases should take between 10% and 30% of the project -time. But even with such a large fraction, don't put too much faith -in testing alone. It cannot find bugs that aren't tested for. So, -while testing is an important activity inherent to safe software -development practices, it cannot become a substitute for proactive -bug hunting, source code inspection, and a bug-free-driven development -approach from the start. - -
Testing should happen alongside coding. If you write a -function, immediately load it into your toplevel, evaluate its -definition, and call it for a couple of arguments to make sure the -function works as expected. If not, then change the function -definition, re-evaluate it, re-call it, etc. Dynamic languages with -an interactive toplevel such as Common Lisp or Python make this easy for -you. Dynamic redefinition capabilities (full in Common Lisp, partial -in Python) are very programmer-friendly in this respect. If your test -cases are worth keeping, then keep them in a test file. -(It is almost always a good idea to store them in the test file, -since you cannot predict whether you will want to change something in -the future.) We'll see below how to store your tests in a test file. - -
When testing, it is nice to know some rules of thumb, like: check -your edge cases (e.g. null array), check atypical input values -(e.g. laaarge array instead of typically 5-6 elements only), check -your termination conditions, ask whether your arguments have already -been safe-proofed or whether it is in your mandate to check them, -write a test case for each `if-else' branch of the code to explore all -the possibilities, etc. Another interesting rule of thumb is the bug -frequency distribution. Experience has shown that bugs tend to -cluster. If you discover a bug, chances are that other bugs are -in the neighborhood. The famous 80/20 rule of thumb applies here too: -about 80% of bugs are located in about 20% of the code. Another rule -of thumb: if you find a bug caused by some coding practice pattern -that may be used elsewhere too, look for and fix the other instances of that pattern. - -
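The rules of thumb above can be sketched as a small unittest case. The function intersect_hit_sets below is a hypothetical stand-in written only for this illustration; it is not part of CDS Invenio. The sketch covers a typical input, the empty edge case, an atypically large input, and the branch where nothing is shared.

```python
import unittest


def intersect_hit_sets(left, right):
    """Hypothetical helper: return sorted record IDs present in both lists."""
    return sorted(set(left) & set(right))


class TestIntersectHitSets(unittest.TestCase):
    """Typical, edge-case and atypical inputs for the hypothetical helper."""

    def test_typical_input(self):
        """hit set intersection of typical input"""
        self.assertEqual([2, 3], intersect_hit_sets([1, 2, 3], [2, 3, 4]))

    def test_empty_input(self):
        """hit set intersection of empty input (edge case)"""
        self.assertEqual([], intersect_hit_sets([], [1, 2, 3]))
        self.assertEqual([], intersect_hit_sets([], []))

    def test_large_input(self):
        """hit set intersection of atypically large input"""
        big = list(range(100000))
        self.assertEqual(big, intersect_hit_sets(big, big))

    def test_disjoint_input(self):
        """hit set intersection of input with nothing in common"""
        self.assertEqual([], intersect_hit_sets([1, 2], [3, 4]))
```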
In a nutshell, the best advice to write bug-free code is: think -ahead. Try to prepare in advance for unusual usage scenarios, to -foresee problems before they happen. Don't rely on typical input and -typical usage scenarios. Things have a tendency to become atypical -one day. Recall that testing is necessary, but not sufficient, to -write good code. Therefore, think ahead! - -
Core functionality, such as the hit set intersection for the search -engine, or the text input manipulating functions of the BibConvert -language, should come with a couple of test cases to assure proper -behaviour of the core functionality. The test cases should cover -typical input (e.g. hit set corresponding to the query for ``ellis''), -as well as the edge cases (e.g. empty/full hit set) and other unusual -situations (e.g. non-UTF-8 accented input for BibConvert functions to -test a situation of different number of bytes per char). - -
The test cases should be written for the most important core -functionality. Not every function or class in the code is to be -thoroughly tested. Common sense will tell. - -
Unit test cases are free of side-effects. Users should be able to -run them on a production database without any harm to their data. This -is because the tests test ``units'' of the code, not the application -as such. If the behaviour of the function you would like to test -depends on the status of the database, or some other parameters that -cannot be passed to the function itself, the unit testing framework is -not suitable for this kind of situation and you should use the -regression testing framework instead (see below). - -
For more information on Pythonic unit testing, see the -documentation to the unittest module at http://docs.python.org/lib/module-unittest.html. -For a tutorial, see for example http://diveintopython.org/unit_testing/. - -
Each core file that is located in the lib directory (such as the
-webbasketlib.py
in the example above) should come with a
-testing file where the test cases are stored. The test file is to be
-named identically as the lib file it tests, but with the suffix
-_tests
(in our example,
-webbasketlib_tests.py
).
-
-
The test cases are written using the TestCase class of Python's unittest module. -An example of testing the search engine's query parameter washing function: - -
-- --$ cat /opt/cds-invenio/lib/python/invenio/search_engine_tests.py -[...] -import search_engine -import unittest - -class TestWashQueryParameters(unittest.TestCase): - """Test for washing of search query parameters.""" - - def test_wash_url_argument(self): - """search engine washing of URL arguments""" - self.assertEqual(1, search_engine.wash_url_argument(['1'],'int')) - self.assertEqual("1", search_engine.wash_url_argument(['1'],'str')) - self.assertEqual(['1'], search_engine.wash_url_argument(['1'],'list')) - self.assertEqual(0, search_engine.wash_url_argument('ellis','int')) - self.assertEqual("ellis", search_engine.wash_url_argument('ellis','str')) - self.assertEqual(["ellis"], search_engine.wash_url_argument('ellis','list')) - self.assertEqual(0, search_engine.wash_url_argument(['ellis'],'int')) - self.assertEqual("ellis", search_engine.wash_url_argument(['ellis'],'str')) - self.assertEqual(["ellis"], search_engine.wash_url_argument(['ellis'],'list')) -[...] --
In addition, each test file is supposed to define a
-create_test_suite()
function that will return test suite
-with all the tests available in this file:
-
-
-- --$ cat /opt/cds-invenio/lib/python/invenio/search_engine_tests.py -[...] -def create_test_suite(): - """Return test suite for the search engine.""" - return unittest.TestSuite((unittest.makeSuite(TestWashQueryParameters,'test'), - unittest.makeSuite(TestStripAccents,'test'))) -[...] --
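The same create_test_suite() idea can also be sketched with unittest.TestLoader, which may be preferable on recent Python versions where unittest.makeSuite is deprecated. The two test case classes below are placeholders standing in for the real ones; their bodies are assumptions made only so that the sketch runs on its own.

```python
import unittest


class TestWashQueryParameters(unittest.TestCase):
    """Placeholder standing in for the real test case of the same name."""

    def test_placeholder(self):
        """placeholder check, not the real washing test"""
        self.assertEqual(1, int('1'))


class TestStripAccents(unittest.TestCase):
    """Placeholder standing in for the real test case of the same name."""

    def test_placeholder(self):
        """placeholder check, not the real accent stripping test"""
        self.assertEqual("ellis", "Ellis".lower())


def create_test_suite():
    """Return a test suite with all the tests defined in this file."""
    loader = unittest.TestLoader()
    return unittest.TestSuite((
        loader.loadTestsFromTestCase(TestWashQueryParameters),
        loader.loadTestsFromTestCase(TestStripAccents)))
```

The loader collects every method whose name starts with ``test'', which matches the 'test' prefix passed to makeSuite in the example above.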
This will enable us to later include this file into
-testsuite
executable:
-
-
-- --$ cat ~/src/cds-invenio/modules/miscutil/bin/testsuite.in -[...] -from invenio import search_engine_tests -from invenio import bibindex_engine_tests - -def create_all_test_suites(): - """Return all test suites for all CDS Invenio modules.""" - return unittest.TestSuite((search_engine_tests.create_test_suite(), - bibindex_engine_tests.create_test_suite())) -[...] --
In this way, all the test cases defined in the file
-search_engine_tests.py
will be executed when the global
-testsuite
executable is called.
-
-
Note that it may be time-consuming to run all the tests in one go.
-If you are interested in running tests only on a certain file (say
-search_engine_tests.py
), then launch:
-
-
-- --$ python /opt/cds-invenio/lib/python/invenio/search_engine_tests.py --
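For the single-file run above to do anything, the test file needs an entry point. A minimal sketch, assuming a create_test_suite() function as described earlier; TestExample and run_test_suite are hypothetical names used only for illustration:

```python
import unittest


class TestExample(unittest.TestCase):
    """Placeholder test case so that the sketch is self-contained."""

    def test_upper(self):
        """example uppercasing of a string"""
        self.assertEqual("ELLIS", "ellis".upper())


def create_test_suite():
    """Return a test suite with all the tests defined in this file."""
    loader = unittest.TestLoader()
    return unittest.TestSuite((loader.loadTestsFromTestCase(TestExample),))


def run_test_suite(suite):
    """Run the suite verbosely and return the unittest result object."""
    return unittest.TextTestRunner(verbosity=2).run(suite)


if __name__ == "__main__":
    # Executed only when the file is launched directly with python.
    run_test_suite(create_test_suite())
```

With verbosity=2 the runner prints the first line of each test method's docstring followed by the outcome of that test.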
For full-scale examples, you may follow
-search_engine_tests.py
and other
_tests.py
-files in the source distribution.
-
-2.3 Running unit tests
-
The CDS Invenio test suite can be run in the source directory:
-
-$ make test
-
-or anytime after the installation:
-
-$ /opt/cds-invenio/bin/testsuite
-
-The ``testsuite'' executable will run all available unit tests
-provided with CDS Invenio.
-
The informative output is of the form:
-
-$ make test
-CDS Invenio v0.3.2.20040519 test suite results:
-===========================================
-search engine washing of query patterns ... ok
-search engine washing of URL arguments ... ok
-search engine stripping of accented letters ... ok
-bibindex engine list union ... ok
-
-----------------------------------------------------------------------
-Ran 4 tests in 0.121s
-
-OK
-
-In case of problems you will see failures like:
-
-CDS Invenio v0.3.2.20040519 test suite results:
-===========================================
-search engine washing of query patterns ... FAIL
-search engine washing of URL arguments ... ok
-search engine stripping of accented letters ... ok
-bibindex engine list union ... ok
-
-======================================================================
-FAIL: search engine washing of query patterns
-----------------------------------------------------------------------
-Traceback (most recent call last):
- File "/opt/cds-invenio/lib/python/invenio/search_engine_tests.py", line 25, in test_wash_pattern
- self.assertEqual("ell*", search_engine.wash_pattern('ell*'))
- File "/usr/lib/python2.3/unittest.py", line 302, in failUnlessEqual
- raise self.failureException, \
-AssertionError: 'ell*' != 'ell'
-
-----------------------------------------------------------------------
-Ran 4 tests in 0.091s
-
-FAILED (failures=1)
-
The test suite compliance should be checked before each CVS commit.
-(And, obviously, double-checked before each CDS Invenio release.)
-
-3. Regression testing
-
-3.1 Regression testing philosophy
-
In addition to the above-mentioned unit testing of important
-functions, regression testing should ensure that the overall
-application functionality is behaving well and is not altered by code
-changes. This is especially important if a bug has previously been
-found. Then a regression test case should be written to assure that
-it will never reappear. (It also helps to scan the neighborhood of
-the bug, or the whole codebase, for occurrences of the same kind of
-bug; see the 80/20 rule of thumb cited above.)
-
Moreover, the regression test suite should be used when the
-functionality of the item we would like to test depends on
-extra-parametrical status, such as the database content. Also, the
-regression framework is suitable for testing the overall behaviour
-of web pages. (In extreme programming, the regression testing is called
-acceptance testing, the name that evolved from previous
-functionality testing.)
-
Within the framework of the regression test suite, we have liberty
-to alter database content, unlike that of the unit testing framework.
-We can also simulate the web browser in order to test web
-applications.
-
As an example of a regression test, we can test whether the web
-pages are alive; whether searching for Ellis in the demo site indeed
-produces 12 records; whether searching for aoeuidhtns produces no hits
-but the box of nearest terms, and with which content; whether
-accessing the Theses collection search page triggers an Apache password
-prompt; whether the admin interface is really accessible only to
-admins or also to guests, etc.
-
For more information on regression testing, see for example http://c2.com/cgi/wiki?RegressionTesting.
-
-3.2 Writing regression tests
-
Regression tests are written per application (or sub-module) in
-files named like
-websearch_regression_tests.py
or
-websubmitadmin_regression_tests.py
.
-
When writing regression tests, you can assume that the site is in
-the fresh demo mode (Atlantis Institute of Fictive Science). You can
-safely write not only read-only database tests, but you can also
-safely insert/update/delete into/from the database whatever values you
-need for testing. Users are warned prior to running the regression
-test suite about its possibly destructive side-effects. (See below.)
-Therefore you can create users, create user groups, attach users to
-groups to test the group joining process etc, as needed.
-
For testing web pages using GET arguments, you can take advantage
-of the following helper function:
-
-
-
-$ cat /opt/cds-invenio/lib/python/invenio/testutils.py
-[...]
-def test_web_page_content(url, username="guest", expected_text="