Page MenuHomec4science

coding-style.webdoc
No OneTemporary

File Metadata

Created
Sat, Nov 16, 05:23

coding-style.webdoc

## -*- mode: html; coding: utf-8; -*-
## $Id$
## This file is part of CDS Invenio.
## Copyright (C) 2002, 2003, 2004, 2005, 2006, 2007, 2008 CERN.
##
## CDS Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## CDS Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDS Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
<!-- WebDoc-Page-Title: Coding Style -->
<!-- WebDoc-Page-Navtrail: <a class="navtrail" href="<CFG_SITE_URL>/help/hacking">Hacking CDS Invenio</a> -->
<!-- WebDoc-Page-Revision: $Id$ -->
<pre>
A brief description of things we strive at, more or less unsuccessfully.
1. Packaging
We use the classical GNU Autoconf/Automake approach, for tutorial
see e.g. <a href="http://www.amath.washington.edu/~lf/tutorials/autoconf/tutorial_toc.html">Learning the GNU development tools</a> or the <a href="http://sources.redhat.com/autobook/autobook/autobook_toc.html">AutoBook</a>.
2. Modules
CDS Invenio started as a set of pretty independent modules developed by
independent people with independent styles. This was even more
pronounced by the original use of many different languages
(e.g. Python, PHP, Perl). Now the CDS Invenio code base is striving to
use Python everywhere, except in speed-critical parts when a
compiled language such as Common Lisp may come to the rescue in the
near future.
When modifying an existing module, we propose to strictly continue
using whatever coding style the module was originally written into.
When writing new modules, we propose to stick to the
below-mentioned standards.
The code integration across modules is happening, but is slow.
Therefore, don't be surprised to see that there is a lot of room to
refactor.
3. WML/ePerl/etc
This is not so important, because not many lines-of-code were
written in WML/ePerl. We prefer to loosely follow the GNU way, as
always.
4. Python
We aim at following recommendations from <a
href="http://www.python.org/peps/pep-0008.html">PEP 8</a>, although
the existing code surely do not fulfil them here and there.
The code indentation is done via spaces only, please do not use
tabs. One tab counts as four spaces. Emacs users can look into
our <a href="<CFG_SITE_URL>/hacking/cdsware.el">cdsware.el</a> for inspiration.
All the Python code should be extensively documented via
docstrings, so you can always run pydoc file.py to peruse the
file's documentation in one simple go.
Do not forget to run pylint on your code to check for errors like
uninitialized variables and to improve its quality and conformance
to the coding standard. If you develop in Emacs, run M-x pylint
RET on your buffers frequently. Read and implement pylint
suggestions. (Note that using lambda and friends may lead to false
pylint warnings. You can switch them off by putting block comments
of the form ``# pylint: disable-msg=C0301''.)
Do not forget to run pychecker on your code either. It is another
source code checker that catches some situations better and some
situations worse than pylint. If you develop in Emacs, run C-c C-w
(M-x py-pychecker-run RET) on your buffers frequently. (Note that
using psyco on classes may lead to false pychecker warnings.)
You can check the kwalitee of your code by running ``python
modules/miscutil/lib/kwalitee.py *.py'' on your files. You can
also check the code kwalitee across all the modules by running
``make kwalitee-check'' in the main source directory.
Do not hardcode magic constants in your code. Every magic string or
a number should be put into accompanying file_config.py with
symbol name beginning by cfg_modulename_*.
Clearly separate interfaces from implementation. Document your
interfaces. Do not expose to other modules anything that does not
have to be exposed. Apply principle of least information.
Create as few new library files as possible. Do not create many
nested files in nested modules; rather put all the lib files in one
dir with bibindex_foo and bibindex_bar names.
Use imperative/functional paradigm rather then OO. If you do use
OO, then stick to as simple class hierarchy as possible. Recall
that method calls and exception handling in Python are quite
expensive.
Use rather the good old foo_bar naming convention for symbols (both
variables and function names) instead of fooBar CaMelCaSe
convention. (Except for Class names where UppercaseSymbolNames are
to be used.)
Pay special attention to name your symbols descriptively. Your
code is going to be read and work with by others and its symbols
should be self-understandable without any comments and without
studying other parts of the code. For example, use proper English
words, not abbreviations that can be misspelled in many a way; use
words that go in pair (e.g. create/destroy, start/stop; never
create/stop); use self-understandable symbol names
(e.g. list_of_file_extensions rather than list2); never misname
symbols (e.g. score_list should hold the list of scores and nothing
else - if in the course of development you change the semantics of
what the symbol holds then change the symbol name too). Do not be
afraid to use long descriptive names; good editors such as Emacs
can tab-complete symbols for you.
When hacking module A, pay close attention to ressemble existing
coding convention in A, even if it is legacy-weird and even if we
use a different technique elsewhere. (Unless the whole module A is
going to be refactored, of course.)
Speed-critical parts should be profiled with pyprof. Do not forget
to use tricks like psyco.
The code should be well tested before committed. Testing is an
integral part of the development process. Test along as you
program. The testing process should be automatized via our unit
test and regression test suite infrastructures. Please read the
<a href="test-suite">test suite strategy</a> to know more.
Python promotes writing clear, readable, easily maintainable code.
Write it as such. Recall Albert Einstein's ``Everything should be
made as simple as possible, but not simpler''. Things should be
neither overengineered nor oversimplified.
Recall principles Unix is built upon. As summarized by Eric
S. Reymond's <a href="http://www.catb.org/esr/writings/taoup/html/ch01s06.html#id2877537">TAOUP</a>:
Rule of Modularity: Write simple parts connected by clean interfaces.
Rule of Clarity: Clarity is better than cleverness.
Rule of Composition: Design programs to be connected with other programs.
Rule of Separation: Separate policy from mechanism; separate interfaces from engines.
Rule of Simplicity: Design for simplicity; add complexity only where you must.
Rule of Parsimony: Write a big program only when it is clear by demonstration that nothing else will do.
Rule of Transparency: Design for visibility to make inspection and debugging easier.
Rule of Robustness: Robustness is the child of transparency and simplicity.
Rule of Representation: Fold knowledge into data, so program logic can be stupid and robust.
Rule of Least Surprise: In interface design, always do the least surprising thing.
Rule of Silence: When a program has nothing surprising to say, it should say nothing.
Rule of Repair: Repair what you can -- but when you must fail, fail noisily and as soon as possible.
Rule of Economy: Programmer time is expensive; conserve it in preference to machine time.
Rule of Generation: Avoid hand-hacking; write programs to write programs when you can.
Rule of Optimization: Prototype before polishing. Get it working before you optimize it.
Rule of Diversity: Distrust all claims for one true way.
Rule of Extensibility: Design for the future, because it will be here sooner than you think.
or the golden rule that says it all: ``keep it simple''.
For more hints, thoughts, and other ruminations on programming,
see my <a href="https://twiki.cern.ch/twiki/bin/view/CDS/Invenio">CDS Invenio Wiki</a>.
5. PHP
We are moving slowly away out of PHP so that there may be several
practices in place with the PHP code present in CDS Invenio. Usually
this is consistent within modules but inconsistent across modules.
For example, some old code used Emacs' perl-mode, following
traditional K&R C style, while some other old code tried to stick
to <a href="http://pear.php.net/manual/en/standards.php">PEAR recommendations</a>.
6. MySQL
Table naming policy is, roughly and briefly:
- "foo": table names in lowercase, without prefix, used by me
for WebSearch
- "foo_bar": underscores represent M:N relationship between
"foo" and "bar", to tie the two tables together
- "bib*": many tables to hold the metadata and relationships
between them
- "idx*": idx is the table name prefix used by BibIndex
- "rnk*": rnk is the table name prefix used by BibRank
- "fmt*": fmt is the table name prefix used by BibFormat
- "sbm*": sbm is the table name prefix used by WebSubmit
- "sch*": sch is the table name prefix used by BibSched
- "acc*": acc is the table name prefix used by WebAccess
- "bsk*": acc is the table name prefix used by WebBasket
- "msg*": acc is the table name prefix used by WebMessage
- "cls*": acc is the table name prefix used by BibClassify
- "sta*": acc is the table name prefix used by WebStat
- "jrn*": acc is the table name prefix used by WebJournal
- "collection*": many tables to describe collections and search
interface pages
- "user*" : many tables to describe personal features (baskets,
alerts)
- "hst*": tables related to historical versions of metadata and
fulltext files
- end of file -
</pre>

Event Timeline