BlackDynamite

Installation

Dependencies

bash
sudo apt-get install python-psycopg2
sudo apt-get install python-numpy
sudo apt-get install python-argcomplete

Installation of client side

The easiest way is through pip, which must be installed first:

bash
sudo apt-get install python-pip

Then for a system wide installation (recommended):

bash
sudo pip install git+https://gitlab.com/ganciaux/blackdynamite.git

Or, for a user-scope installation:

bash
pip install --user git+https://gitlab.com/ganciaux/blackdynamite.git

Getting the sources

You can clone the GIT repository:

bash
git clone https://gitlab.com/ganciaux/blackdynamite.git

Installing completion

To benefit from autocompletion in BlackDynamite, the following steps are needed. You first need to install the argcomplete module, either by typing (depending on your Ubuntu/Debian version):

bash
sudo apt-get install python-argcomplete

or:

bash
sudo apt-get install python-pip
sudo pip install argcomplete

Then you must insert the following in your .bashrc:

bash
eval "$(register-python-argcomplete getRunInfo.py)"
eval "$(register-python-argcomplete launchRuns.py)"
eval "$(register-python-argcomplete canYouDigIt.py)"
eval "$(register-python-argcomplete cleanRuns.py)"
eval "$(register-python-argcomplete updateRuns.py)"
eval "$(register-python-argcomplete enterRun.py)"
eval "$(register-python-argcomplete saveBDStudy.py)"

Register hosts to BlackDynamite

In the .blackdynamite folder (in your home directory) you should add the servers where your databases are, with the options and information of your choice.

For each database you can add a .bd file named after the server (or after an alias, in which case the host is specified inside):

bash
host = yourHost.domain.countryID

It is also recommended to specify the password of the database to avoid typing it when using auto-completion.

Here is an example of a valid blackdynamite config file:

bash
cat ~/.blackdynamite/lsmssrv1.epfl.ch.bd

bash
host = lsmssrv1.epfl.ch
password = XXXXXXXXX
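
To make the file layout concrete, here is a minimal sketch of how such 'key = value' lines could be read. It is not BlackDynamite's own parser: the function name parse_bd_config is hypothetical, and the handling of blank lines and '#' comments is an assumption made for the illustration.

```python
def parse_bd_config(text):
    """Parse 'key = value' lines into a dict (illustrative only)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'key = value', got: {line!r}")
        options[key.strip()] = value.strip()
    return options


example = """
host = lsmssrv1.epfl.ch
password = XXXXXXXXX
"""
config = parse_bd_config(example)
print(config["host"])
```

Since the password sits in this file in plain text, it is probably wise to restrict the file permissions to your own user.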

Introduction and philosophy

BlackDynamite is merely a tool to help achieve a few things:

Launching a program repeatedly with varying parameters, to explore the chosen parametric space.

Collecting and sorting small-sized results, benefiting from the power of modern databases.

Analyzing the results by making requests to the associated databases.

Launching is made simple by allowing any executable to be launched. The set of directories will be generated and managed by BlackDynamite to prevent errors. Requests of any kind will then be made to the underlying database through friendly commands of BlackDynamite.

Collecting the results is made possible by the BlackDynamite C/C++ and Python APIs, which let you send results directly to the database and thus sort them automatically. However, heavy data such as Paraview files or any other kind of large data should not be pushed to the database, for performance reasons.

Analysis of the results is made easy by BlackDynamite, which can retrieve data in the form of NumPy arrays to be used, analyzed or plotted with the powerful and vast Python libraries such as Matplotlib and SciPy.

The construction of a BlackDynamite parametric study follows these steps:

Describing the parametric space
Creating jobs (specific points in the parametric space)
Creating runs (instances of the jobs)
Launching runs
Instrumenting the simulation to send results
Analyzing the results

Setting up a parametric study

Choose the parameters of the study

The first thing to do is to set up the table in the database associated with the study we want to perform. To do so you first need to list all the parameters that define a specific case/computation. These parameters can be of simple types such as strings, integers, floats, etc. At present, no vector quantity can be used as an input parameter. Once this list is complete, you need to create a script, usually named 'createDB.py', that will do this task. Let us examine such an example script.

Setting up blackdynamite python modules

First we need to set the python header and import the BlackDynamite modules:

python
#!/usr/bin/env python

import BlackDynamite as BD

Then you have to create a generic BlackDynamite parser and parse the command-line arguments (including the connection parameters and credentials):

python
parser = BD.BDParser()
params = parser.parseBDParameters()

This mechanism makes it easy to inherit from the parser mechanism of BlackDynamite, including the completion (if activated: see installation instructions). Then you can connect to the BlackDynamite database:

python
base = BD.base.Base(**params)

Setting up of the parametric space: the jobs pattern

Then you have to define the parametric space (at present, the parametric space cannot be changed once the study has started: be careful with your choices). Any particular job is defined as a point in the parametric space. For instance, to create a job description and add parameters of type int, float or str, you can use the following python sequence.

python
myjob_desc = BD.job.Job(base)

myjob_desc.types["param1"] = int
myjob_desc.types["param2"] = float
myjob_desc.types["param3"] = str

Important remark: Do not use PostgreSQL keywords as parameter names.
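
As an illustration of this remark, the sketch below checks candidate parameter names against a small subset of PostgreSQL reserved words. The set RESERVED_SUBSET and the helper find_reserved are hypothetical and deliberately incomplete; the PostgreSQL documentation holds the authoritative keyword list.

```python
# Partial, illustrative subset of PostgreSQL reserved words (NOT exhaustive).
RESERVED_SUBSET = {
    "all", "and", "check", "column", "default", "desc", "end", "from",
    "group", "limit", "not", "null", "offset", "or", "order", "select",
    "table", "to", "user", "where",
}


def find_reserved(names):
    """Return the names that collide with the subset, case-insensitively."""
    return [name for name in names if name.lower() in RESERVED_SUBSET]


clashes = find_reserved(["param1", "order", "User", "sigma_c"])
print(clashes)
```

Running such a check before creating the study is cheap, given that the parametric space cannot be changed afterwards.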

Setting up of the run space

Aside from the jobs, a run represents a particular realisation (computation) of a job. More concretely, the run holds information such as the machine it was run on, the executable version, or the number of processors employed. For instance, the run pattern can be created with:

python
myruns_desc = BD.run.Run(base)

myruns_desc.types["compiler"] = str

There are default entries to the description of runs. These are:

machine_name: the name of the machine where the run must be executed
job_id (integer): the ID of the running job
has_started (bool): flag to know whether the job has already started
has_finished (bool): flag to know whether the job has already finished
run_name (string): the name of the run
wait_id (int): the ID of a run to wait for before starting
start_time (TIMESTAMP): The start time for the run
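
To visualize these default entries, here is a plain-Python sketch of a record carrying the same fields. It is not the BlackDynamite Run class: RunRecord and can_start are hypothetical names, the default values are assumptions, and the rule used for wait_id (a run may start once the run it waits for has finished) is one plausible reading of the description above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class RunRecord:
    """Illustrative stand-in for the default run entries."""
    machine_name: str
    job_id: int
    run_name: str
    has_started: bool = False
    has_finished: bool = False
    wait_id: Optional[int] = None
    start_time: Optional[datetime] = None


def can_start(run, finished_ids):
    """Assumed rule: not yet started, and no pending run to wait for."""
    if run.has_started:
        return False
    return run.wait_id is None or run.wait_id in finished_ids


first = RunRecord("lsmspc41", job_id=1, run_name="toto")
second = RunRecord("lsmspc41", job_id=2, run_name="toto", wait_id=7)
print(can_start(first, set()), can_start(second, set()), can_start(second, {7}))
```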

Commit the changes to the database

Then you have to request the creation of the database:

python
base.createBase(myjob_desc,myruns_desc,**params)

You then have to launch the script. As mentioned, all BlackDynamite scripts inherit from the parsing system, so when launching any of them you can always ask for the valid keywords:

bash
./createDB.py --help

usage: createDB.py [-h] [--job_constraints JOB_CONSTRAINTS] [--study STUDY]
                   [--port PORT] [--host HOST] [--user USER] [--truerun]
                   [--run_constraints RUN_CONSTRAINTS] [--yes] [--password]
                   [--list_parameters] [--BDconf BDCONF]
                   [--binary_operator BINARY_OPERATOR]

BlackDynamite option parser

optional arguments:
  -h, --help            show this help message and exit

General:
  --yes                 Answer all questions to yes. (default: False)
  --binary_operator BINARY_OPERATOR
                        Set the default binary operator to make requests to
                        database (default: and)

BDParser:
  --job_constraints JOB_CONSTRAINTS
                        This allows to constraint run selections by job
                        properties (default: None)
  --study STUDY         Specify the study from the BlackDynamite database.
                        This refers to the schemas in PostgreSQL language
                        (default: None)
  --port PORT           Specify data base server port (default: None)
  --host HOST           Specify data base server address (default: None)
  --user USER           Specify user name to connect to data base server
                        (default: None)
  --truerun             Set this flag if you want to truly perform the action
                        on base. If not set all action are mainly dryrun
                        (default: False)
  --run_constraints RUN_CONSTRAINTS
                        This allows to constraint run selections by run
                        properties (default: None)
  --password            Flag to request prompt for typing password (default:
                        False)
  --list_parameters     Request to list the possible job/run parameters
                        (default: False)
  --BDconf BDCONF       Path to a BlackDynamite file (*.bd) configuring
                        current optons (default: None)

An important point is that most of the actions are only applied when the 'truerun' flag is set. Also, you always have to mention the host and the study you are working on (all scripts can apply to several studies). To launch the script and create the database you should launch:

bash
./createDB.py --host lsmssrv1.epfl.ch --study MysuperCoolStudy --truerun
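
The dry-run behaviour tied to the 'truerun' flag can be sketched with the standard argparse module. This is not BlackDynamite's parser: build_parser and apply_action are hypothetical helpers that only mimic the pattern of skipping an action unless '--truerun' is given.

```python
import argparse


def build_parser():
    """Tiny parser exposing a subset of the options listed above."""
    parser = argparse.ArgumentParser(description="dry-run sketch")
    parser.add_argument("--host")
    parser.add_argument("--study")
    # store_true: the flag defaults to False, so actions are dry runs.
    parser.add_argument("--truerun", action="store_true")
    return parser


def apply_action(params, action):
    """Call action() only when 'truerun' is set; report what happened."""
    if params["truerun"]:
        action()
        return "applied"
    return "dry run"


performed = []
args = ["--host", "lsmssrv1.epfl.ch", "--study", "test"]
params = vars(build_parser().parse_args(args))
print(apply_action(params, lambda: performed.append("create")))

params = vars(build_parser().parse_args(args + ["--truerun"]))
print(apply_action(params, lambda: performed.append("create")))
```

Making the destructive path opt-in means that a forgotten flag costs only a rerun, never an unwanted change to the database.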

Creating the jobs

The goal of the parametric study is to explore a subpart of the parametric space. We need to create jobs that are the points to explore.

We need to write a python script, usually named 'createJobs.py', to generate this set of jobs. We start by setting the modules and the parser as for the 'createDB.py' script. Then we need to create a job object:

python
job = BD.job.Job(base)

It is up to us to decide the values to explore. For convenience, it is possible to insert ranges of values:

python
job["param1"] = 10
job["param2"] = [3.14, 1., 2.]
job["param3"] = 'toto'

This will create 3 jobs since we provided a range of values for the second parameter. The actual creation is made by calling:

python
base.createParameterSpace(job)
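
The count of 3 jobs can be reproduced with a small standalone sketch. It assumes that list-valued entries are expanded as a Cartesian product, which matches the single-list example above but is otherwise an assumption; expand_parameter_space is a hypothetical helper, not the BlackDynamite implementation.

```python
from itertools import product


def expand_parameter_space(entries):
    """Expand list-valued entries into one dict per job (assumed product)."""
    names = list(entries)
    # Wrap scalars in a one-element list so every parameter is a range.
    ranges = [v if isinstance(v, list) else [v] for v in entries.values()]
    return [dict(zip(names, combo)) for combo in product(*ranges)]


jobs = expand_parameter_space(
    {"param1": 10, "param2": [3.14, 1.0, 2.0], "param3": "toto"}
)
print(len(jobs))
for j in jobs:
    print(j)
```

Under this assumption, giving ranges for two parameters would multiply the counts, so the number of jobs can grow quickly.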

Launching the script is made with:

bash
./createJobs.py --host lsmssrv1.epfl.ch --study test --truerun

Creating the runs and launching them

At this point the jobs are in the database. You need to create runs that specify the conditions under which the jobs are realized: for example, the machine on which the job will run, path-dependent information, executable information, and so on. We have to write one last script, usually named 'createRuns.py', to specify the run creation.

Again we start with the modules. This time, however, we can use another parser class better suited to the manipulation of runs:

python
parser = BD.RunParser()
params = parser.parseBDParameters()
base = BD.Base(**params)

The default parameters for runs will then be automatically included in the parameters.

python
myrun = BD.run.Run(base)

Some of the standard parameters might have been parsed directly by the RunParser, so that we have to forward them to the Run object:

python
myrun.setEntries(params)

A run now specifies what action to perform to realize the job. Usually, an end user has one or more scripts and wishes to attach them to the run. To attach files you can, for instance, do:

python
myrun.addConfigFiles(['file1','file2','launch.sh'])

Then, one has to specify which of these files is the entry point:

python
myrun.setExecFile("launch.sh")

Finally, we have to create Run objects and attach them to jobs. The very first task is to retrieve the jobs from the database.

To that end the JobSelector object shall be your friend:

python
jobSelector = BD.JobSelector(base)
job_list = jobSelector.selectJobs()

This will return a job list that you can loop through and attach the runs to:

python
for j in job_list:
    myrun['compiler'] = 'gcc'
    myrun.attachToJob(j)

Everything should then be committed to the database:

python
if params["truerun"] is True:
    base.commit()

Finally, to create the runs, launch the script by typing:

bash
./createRuns.py --host lsmssrv1.epfl.ch --study test --machine_name lsmspc41 --run_name toto --nproc int --truerun

The runs are finally launched using the tool 'launchRuns.py'.

bash
./launchRuns.py --host lsmssrv1.epfl.ch --study test --outpath /home/user/ --truerun (--nruns int)

Accessing and manipulating the database

The runs can be inspected in the database with the tool 'getRunInfo.py', and one can enter the run folder with 'enterRun.py'.

bash
./getRunInfo.py --host lsmssrv1.epfl.ch --study test
./enterRun.py --host lsmssrv1.epfl.ch --study test --run_id ID

The status of a run can be modified manually with the command 'cleanRuns.py'. The default status is CREATED (the run can instead be deleted with the '--delete' option):

bash
./cleanRuns.py --host lsmssrv1.epfl.ch --study test (--runid ID) --truerun (--delete)

The status and the other run parameters (e.g. the compiler in the example file) can also be modified with 'updateRuns.py'. This can be done from within the executed script to set the selected parameter automatically:

bash
updateRuns.py --host lsmssrv1.epfl.ch --study test --updates 'state = toto' --truerun

The script 'canYouDigIt.py' is an example of how to collect data from the runs to draw graphs. For example, to plot the crack length as a function of time for different sigma_c (the study parameter):

bash
canYouDigIt.py --host lsmssrv1.epfl.ch --study test --quantity time,crack_length --using %0.x:%1.y --xlabel 'time' --ylabel 'crack_length' --legend 'sigma_c = %j.sigma_c'

Finally, the database can be saved in .zip format with 'saveBDStudy.py', to be exported and used offline.

Instrumenting a C++ simulation code

Within your program you need a correctly initialized pusher in order to push data to the database. The file 'test_blackdynamite.cc' is an example of such a pusher.

First, the BlackDynamite includes are required:

cpp
#include "blackdynamite.hh"

Then you need to create a pusher object (here a RunManager) and initialize it:

cpp
BlackDynamite::RunManager bd;
bd.startRun();

By default, the constructor reads environment variables to get the database connection and schema information:

RUN_ID: the identifier of the run
SCHEMA: the schema where the parametric study is to be stored
HOST: the database hostname

Then, wherever values are produced, you push them to the database:

cpp
bd.push(val1,"quantity1",step);
bd.push(val2,"quantity2",step);

Step is a stage identifier. It can be the step index within an explicit loop, an iteration within a convergence descent, or whatever you wish. It will serve later to compare quantity entries.

Finally, when the job has ended, the following call informs the database that the run is finished:

cpp
bd.endRun();

Instrumenting a Python simulation code

Within your program you need a run object to push data to the database. This is done by selecting the run from the 'run_id' (usually passed as parameter).

python
parser = BD.RunParser()
params = parser.parseBDParameters()
# params['run_id'] should exist
mybase = BD.Base(**params)
runSelector = BD.RunSelector(mybase)
myrun = runSelector(params)

In order to have time entries for run times, the 'start()' and 'finish()' methods of the run need to be called.

python
myrun.start()
# ...
# Important stuff
# ...
myrun.finish()
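
One way to make sure the two calls always come in pairs is a context manager. The sketch below assumes only that the run object exposes 'start()' and 'finish()' as shown above; timed_run is a hypothetical helper and FakeRun is a stand-in used to keep the example runnable without a database.

```python
from contextlib import contextmanager


@contextmanager
def timed_run(run):
    """Call run.start() on entry and run.finish() on exit."""
    run.start()
    try:
        yield run
    finally:
        # Runs even if the body raises; see the note below.
        run.finish()


class FakeRun:
    """Stand-in that records calls instead of touching a database."""

    def __init__(self):
        self.calls = []

    def start(self):
        self.calls.append("start")

    def finish(self):
        self.calls.append("finish")


fake = FakeRun()
with timed_run(fake):
    fake.calls.append("work")
print(fake.calls)
```

Note the design choice: with 'finally', a run that crashes is still marked as finished. If failed runs should remain unfinished, call 'finish()' after the 'yield' without the try/finally.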

Pushing data can be done with 'pushVectorQuantity()' and 'pushScalarQuantity()'.

python
myrun.pushVectorQuantity(vector_quantity, step, "quantity_id", is_integer=False)
myrun.pushScalarQuantity(scalar_quantity, step, "quantity_id", is_integer=False)
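
The role of the step argument can be illustrated with an in-memory stand-in for the database. QuantityStore is hypothetical and unrelated to the actual storage layer; it only shows how entries of two quantities pushed with the same step can be matched later, for example to plot one against the other.

```python
from collections import defaultdict


class QuantityStore:
    """In-memory sketch: quantity name -> {step: value}."""

    def __init__(self):
        self._data = defaultdict(dict)

    def push(self, value, step, name):
        self._data[name][step] = value

    def paired(self, x_name, y_name):
        """Return (x, y) pairs for the steps both quantities share."""
        x, y = self._data[x_name], self._data[y_name]
        return [(x[s], y[s]) for s in sorted(x.keys() & y.keys())]


store = QuantityStore()
for step in range(3):
    store.push(0.5 * step, step, "time")
# crack_length is only recorded at steps 0 and 2.
store.push(0.0, 0, "crack_length")
store.push(2.5, 2, "crack_length")
print(store.paired("time", "crack_length"))
```

Quantities pushed at different frequencies can still be compared this way, since only the common steps are paired.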

Fetching the results

Under construction...

Installation of the server side: setting up the PostgreSQL database (for admins)

If you want to set up a PostgreSQL server to store BlackDynamite data, follow this procedure.

Install the PostgreSQL server:

bash
sudo apt-get install postgresql-9.4

You now need privileges to create databases and users. This can be done using the following:

You should add a database named blackdynamite (only the first time):

bash
psql --command "CREATE USER blackdynamite WITH PASSWORD '';"
createdb -O blackdynamite blackdynamite

Adding a user

You should create a user:

bash
psql --command "CREATE USER mylogin WITH PASSWORD 'XXXXX';"

Then grant the user permission to create tables:

bash
psql --command "grant create on database blackdynamite to mylogin"

This can also be done with the convenience tool:

bash
createUser.py --user admin_user --host hostname

Useful PostgreSQL commands

How to list the available schemas?

psql
> \dn

How to get into the schema or the study?

psql
> set search_path to schema_name;
> set search_path to study_name;

How to list all the tables?

psql
> \d

psql
> SELECT * FROM pg_catalog.pg_tables;

How to list entries from a table (like the jobs)?

psql
> SELECT * from table_name ;

How to list all the databases?

psql
> \l

How to list the available databases?

psql
> select datname from pg_catalog.pg_database;

How to know the current database?

psql
> select current_database();
