A brief description of the things we strive for, more or less unsuccessfully.
1. Packaging
We use the classical GNU Autoconf/Automake approach; for a tutorial,
see e.g. <a href="http://www.amath.washington.edu/~lf/tutorials/autoconf/tutorial_toc.html">Learning the GNU development tools</a> or the <a href="http://sources.redhat.com/autobook/autobook/autobook_toc.html">AutoBook</a>.
2. Modules
CDSware started as a set of fairly independent modules developed by
independent people with independent styles. This was made even more
pronounced by the original use of many different languages
(e.g. Python, PHP, Perl). Now the CDSware code base is striving to
use Python everywhere, except in speed-critical parts, where a
compiled language such as Common Lisp may come to the rescue in the
near future.
When modifying an existing module, we propose to strictly continue
using whatever coding style the module was originally written in.
When writing new modules, we propose to stick to the standards
described below.
The code integration across modules is happening, but is slow.
Therefore, don't be surprised to see that there is a lot of room to
refactor.
3. WML/ePerl/etc
This is not so important, because not many lines of code were
written in WML/ePerl. We prefer to loosely follow the GNU way, as
always.
4. Python
We aim at following recommendations from <a
href="http://www.python.org/peps/pep-0008.html">PEP 8</a>, although
the existing code surely does not fulfil them here and there.
The most easily noticeable exception to PEP 8 is perhaps the line
width, which usually exceeds 72 characters per line in our sources,
as we tend to use fullscreen editors. The code indentation is done
via spaces only; please do not use tabs. One tab counts as four
spaces. Emacs users can look into our <a
href="cdsware.el">cdsware.el</a> for inspiration.
All the Python code should be extensively documented via
docstrings, so you can always try to run pydoc file.py to
peruse a module's documentation in one go.
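As a minimal sketch of what such documentation might look like (the
module and the function get_file_extension are hypothetical, not part
of CDSware), a docstring goes at the top of the file and at the top
of every function:

```python
"""Hypothetical example module showing docstring-based documentation."""


def get_file_extension(file_name):
    """Return the part of file_name after the last dot.

    Return an empty string when file_name contains no dot at all.
    """
    if "." not in file_name:
        return ""
    return file_name.rsplit(".", 1)[1]
```

Both docstrings are what pydoc would display for such a file.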
Do not forget to run pylint on your code to check for errors like
uninitialized variables and to improve its quality and conformance
to coding standards. A typical incantation may be:
$ cd <LIBDIR>/python/cdsware
$ pylint --max-line-length=160 oai_repository.py | most
Read and implement pylint suggestions.
Do not hardcode magic constants in your code. Every magic string or
number should be put into the accompanying file_config.py, with a
symbol name beginning with cfg_modulename_*.
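A rough sketch of the idea, assuming a hypothetical bibindex_config.py
and made-up constant names (the real configuration symbols may differ).
Both parts are shown in one block here only so that the example runs
on its own:

```python
# --- would live in a hypothetical bibindex_config.py ---
cfg_bibindex_min_word_length = 3  # assumed value, for illustration only

# --- would live in the module code, which would import the symbol above ---


def is_indexable_word(word):
    """Return True if word is long enough to be indexed."""
    return len(word) >= cfg_bibindex_min_word_length
```

The function body contains no literal 3; changing the threshold means
touching the config file only.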
Clearly separate interfaces from implementation. Document your
interfaces. Do not expose to other modules anything that does not
have to be exposed. Apply the principle of least information.
Create as few new library files as possible. Do not create many
nested files in nested modules; rather, put all the lib files in one
directory with names such as bibindex_foo and bibindex_bar.
Use the imperative/functional paradigm rather than OO. If you do use
OO, then stick to as simple a class hierarchy as possible. Recall
that method calls and exception handling in Python are quite
expensive.
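If in doubt about such costs, measure them. The following sketch
(hypothetical data and function names) uses the standard timeit module
to time a lookup of a missing key done via exception handling against
the same lookup done via a plain membership test. The numbers depend
on the interpreter version and the machine, so no particular result
is claimed here:

```python
import timeit

records = {"title": "An example record"}  # hypothetical data


def get_field_with_exception(field_name):
    """Look up a field, relying on exception handling for missing keys."""
    try:
        return records[field_name]
    except KeyError:
        return ""


def get_field_with_check(field_name):
    """Look up a field, using a plain membership test for missing keys."""
    if field_name in records:
        return records[field_name]
    return ""


# "author" is deliberately absent, so the exception path is exercised.
time_with_exception = timeit.timeit(
    lambda: get_field_with_exception("author"), number=100000)
time_with_check = timeit.timeit(
    lambda: get_field_with_check("author"), number=100000)
print("exception: %.4fs, check: %.4fs" % (time_with_exception, time_with_check))
```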
Prefer the good old foo_bar naming convention for symbols (both
variables and function names) over the fooBar CaMelCaSe convention.
(Class names are the exception: use UppercaseSymbolNames for them.)
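For illustration only, with hypothetical names, the convention looks
like this in practice:

```python
class SearchResultFormatter:
    """Class names use UppercaseSymbolNames."""

    def format_record_title(self, record_title):
        """Method and argument names use the foo_bar convention."""
        return record_title.strip().capitalize()


def get_number_of_hits(list_of_record_ids):
    """Function and variable names use the foo_bar convention too."""
    number_of_hits = len(list_of_record_ids)
    return number_of_hits
```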
Pay special attention to naming your symbols descriptively. Your
code is going to be read and worked with by others, and its symbols
should be self-explanatory without any comments and without
studying other parts of the code. For example, use proper English
words, not abbreviations that can be misspelled in many ways; use
words that go in pairs (e.g. create/destroy, start/stop; never
create/stop); use self-explanatory symbol names
(e.g. list_of_file_extensions rather than list2); never misname
symbols (e.g. score_list should hold the list of scores and nothing
else - if in the course of development you change the semantics of
what the symbol holds then change the symbol name too). Do not be
afraid to use long descriptive names; good editors such as Emacs
can tab-complete symbols for you.
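A small sketch of these naming rules, using a hypothetical session
registry that is not part of CDSware:

```python
open_sessions = {}  # hypothetical registry: session id -> user name


def create_session(session_id, user_name):
    """Register a new session; paired with destroy_session below."""
    open_sessions[session_id] = user_name


def destroy_session(session_id):
    """Remove a session registered by create_session, if present."""
    open_sessions.pop(session_id, None)


# Descriptive name: the symbol says what it holds, unlike "list2".
list_of_file_extensions = ["pdf", "ps", "txt"]
```

A reader who sees create_session can guess that destroy_session
exists, which would not be the case with a create/stop pair.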
When hacking module A, take care to follow the existing coding
conventions in A, even if they are legacy-weird and even if we
use a different technique elsewhere. (Unless the whole module A is
going to be refactored, of course.)
All core code should come with a suitable set of test cases, see
the <a href="testsuite.html">test suite strategy</a> for details.
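As a minimal sketch, assuming the standard unittest module and a
hypothetical function wash_page_number (the actual test suite layout
is described in the strategy document, not here), a set of test cases
could look like this:

```python
import unittest


def wash_page_number(page_number):
    """Hypothetical function under test: return page number, at least 1."""
    return max(1, int(page_number))


class WashPageNumberTest(unittest.TestCase):
    """Test cases for wash_page_number."""

    def test_valid_page_number(self):
        """A valid page number given as a string is converted to int."""
        self.assertEqual(wash_page_number("3"), 3)

    def test_negative_page_number(self):
        """A negative page number is washed to the first page."""
        self.assertEqual(wash_page_number(-5), 1)


# Load and run the test cases explicitly, so the outcome can be inspected.
suite = unittest.TestLoader().loadTestsFromTestCase(WashPageNumberTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```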
Speed-critical parts should be profiled with pyprof. Do not forget
to use tricks like psyco.
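The sketch below shows one possible way of profiling a function. Note
that it uses cProfile and pstats from the Python standard library,
which is an assumption on our side and not the pyprof tool named
above; the function being profiled is hypothetical too:

```python
import cProfile
import io
import pstats


def count_word_occurrences(list_of_words):
    """Hypothetical speed-critical function: count how often each word occurs."""
    word_counts = {}
    for word in list_of_words:
        word_counts[word] = word_counts.get(word, 0) + 1
    return word_counts


# Collect profiling data only around the call we are interested in.
profiler = cProfile.Profile()
profiler.enable()
word_counts = count_word_occurrences(["foo", "bar", "foo"] * 10000)
profiler.disable()

# Write the five most expensive entries (by cumulative time) into a string.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The report lists the number of calls and the time spent per function,
which helps to decide where optimization effort is worth spending.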
For more hints, thoughts, and other ruminations, see <a
href="http://simko.info/vademecum/">Vademecum and Essays</a>.
5. PHP
We are slowly moving away from PHP, so several different practices
may be in place in the PHP code present in CDSware. Usually
this is consistent within modules but inconsistent across modules.
For example, some old code used Emacs' perl-mode, following
traditional K&R C style, while some other old code tried to stick
to <a href="http://pear.php.net/manual/en/standards.php">PEAR recommendations</a>.
6. MySQL
Table naming policy is, roughly and briefly:
- "foo": table names in lowercase, without prefix, used by me
for WebSearch
- "foo_bar": underscores represent M:N relationship between
"foo" and "bar", to tie the two tables together
- "bib*": many tables to hold the metadata and relationships
between them
- "idx*": idx is the table name prefix used by BibIndex
- "rnk*": rnk is the table name prefix used by BibRank
- "flx*": flx is the table name prefix used by FlexElink (also known as
BibFormat)
- "sbm*": sbm is the table name prefix used by WebSubmit
- "sch*": sch is the table name prefix used by BibSched
- "collection*": many tables to describe collections and search
interface pages
- "user*" : many tables to describe personal features (baskets,