<center> <img width="50%" src=doc/Black-Dynamite.png/> </center>
```bash
sudo apt-get install python-psycopg2
sudo apt-get install python-numpy
sudo apt-get install python-argcomplete
```
The easiest way is through pip, which must be installed first:
```bash
sudo apt-get install python-pip
```
Then, for a system-wide installation (recommended):
```bash
sudo pip install git+https://c4science.ch/diffusion/3127/blackdynamite.git
```
```bash
sudo pip install git+ssh://firstname.lastname@example.org/diffusion/3127/blackdynamite.git
```
Or, for a user-scope installation:
```bash
pip install --user git+https://c4science.ch/diffusion/3127/blackdynamite.git
```
```bash
pip install --user git+ssh://email@example.com/diffusion/3127/blackdynamite.git
```
You can also clone the Git repository:
```bash
git clone ssh://firstname.lastname@example.org/diffusion/3127/blackdynamite.git
```
```bash
git clone https://c4science.ch/diffusion/3127/blackdynamite.git
```
To benefit from autocompletion for BlackDynamite, the following steps are needed. You first need to install the argcomplete module, by typing either of the following (depending on your Ubuntu/Debian version):
```bash
sudo apt-get install python-argcomplete
```
```bash
sudo apt-get install python-pip
sudo pip install argcomplete
```
Then you must insert the following in your .bashrc:
```bash
eval "$(register-python-argcomplete getRunInfo.py)"
eval "$(register-python-argcomplete launchRuns.py)"
eval "$(register-python-argcomplete canYouDigIt.py)"
eval "$(register-python-argcomplete cleanRuns.py)"
eval "$(register-python-argcomplete updateRuns.py)"
eval "$(register-python-argcomplete enterRun.py)"
eval "$(register-python-argcomplete saveBDStudy.py)"
```
In the .blackdynamite folder (in your home directory) you should register the servers that host your databases, with the options and information of your choice.
For each database you can add a .bd file named after the server (or after an alias, in which case the host is specified inside the file):
```bash
host = yourHost.domain.countryID
```
It is also recommended to store the database password in that file, to avoid typing it when using auto-completion.
Here is an example of a valid BlackDynamite config file:
```bash
cat ~/.blackdynamite/lsmssrv1.epfl.ch.bd
```
```bash
host = lsmssrv1.epfl.ch
password = XXXXXXXXX
```
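To make the file layout concrete, here is a minimal sketch that reads such 'key = value' lines into a dictionary. The helper name `read_bd_config` is hypothetical and this is not BlackDynamite's own parser; it only assumes that the file contains one 'key = value' pair per line, as in the example above.

```python
def read_bd_config(text):
    """Parse 'key = value' lines into a dict (hypothetical helper)."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and lines without an '=' separator.
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


# Same content as the example config file shown above.
example = """
host = lsmssrv1.epfl.ch
password = XXXXXXXXX
"""

config = read_bd_config(example)
print(config["host"])
```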
BlackDynamite is merely a tool to help achieve a few things:
- Launching a program repeatedly with varying parameters, to explore the chosen parametric space.
- Collecting and sorting small-sized results, benefiting from the power of modern databases.
- Analyzing the results by making requests to the associated databases.
Launching is made simple by allowing any executable to be launched. BlackDynamite generates and manages the set of run directories to prevent errors. Requests of any kind can then be made to the underlying database through BlackDynamite's friendly commands.
Collecting the results is possible thanks to the BlackDynamite C/C++ and Python APIs, which let you send results directly to the database, where they are automatically sorted. However, heavy data such as Paraview files should not be pushed to the database, for performance reasons.
Analysis of the results is made easy because BlackDynamite can retrieve data as NumPy arrays, to be used, analyzed or plotted with Python libraries such as Matplotlib and SciPy.
The construction of a BlackDynamite parametric study follows these steps:
- Describing the parametric space
- Creating jobs (specific points in the parametric space)
- Creating runs (instances of the jobs)
- Launching runs
- Instrumenting the simulation to send results
- Analyzing the results
The first thing to do is to set up the table in the database associated with the study we want to perform. To do so, you first need to list all the parameters that define a specific case/computation. These parameters can be of simple types such as strings, integers or floats. At present, no vector quantity can be used as an input parameter. Once this list is complete, you need to create a script, usually named 'createDB.py', that will do this task. Let us examine such an example script.
First we need to set the Python header and import the BlackDynamite module:
```python
#!/usr/bin/env python
import BlackDynamite as BD
```
Then you have to create a generic BlackDynamite parser and parse the command-line options (including the connection parameters and credentials):
```python
parser = BD.BDParser()
params = parser.parseBDParameters()
```
This mechanism makes it easy to inherit the parsing features of BlackDynamite, including completion (if activated: see the installation instructions). Then you can connect to the BlackDynamite database:
```python
base = BD.base.Base(**params)
```
Then you have to define the parametric space (at present, the parametric space cannot be changed once the study has started: be careful with your choices). Any particular job is defined as a point in the parametric space. For instance, to create a job description and add int, float or str parameters, you can use the following Python sequence:
```python
myjob_desc = BD.job.Job(base)
myjob_desc.types["param1"] = int
myjob_desc.types["param2"] = float
myjob_desc.types["param3"] = str
```
Important remark: do not give your parameters names that are PostgreSQL keywords.
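As an illustration of that remark, the following sketch checks candidate parameter names against a small sample of reserved PostgreSQL keywords. The list is deliberately partial and the helper `find_reserved` is hypothetical (it is not part of BlackDynamite); for real use, the full keyword list from the PostgreSQL documentation should be consulted.

```python
# A deliberately partial sample of reserved PostgreSQL keywords (assumption:
# extend it from the PostgreSQL documentation before relying on it).
RESERVED_SAMPLE = {
    "all", "check", "column", "default", "desc", "end", "from", "group",
    "limit", "offset", "order", "select", "table", "to", "user", "where",
}


def find_reserved(names):
    """Return the names that collide with the sample keyword list."""
    # SQL keywords are case-insensitive, so compare in lower case.
    return sorted(name for name in names if name.lower() in RESERVED_SAMPLE)


clashes = find_reserved(["param1", "order", "User", "sigma_c"])
print(clashes)
```

An empty result only means that no name matched this sample, not that every name is safe.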
Besides the jobs, a run represents a particular realisation (computation) of a job. More precisely, the run holds information such as the machine it was run on, the executable version, or the number of processors employed. For instance, the run pattern can be created with:
```python
myruns_desc = BD.run.Run(base)
myruns_desc.types["compiler"] = str
```
There are default entries to the description of runs. These are:
- machine_name: the name of the machine where the run must be executed
- job_id (integer): the ID of the running job
- has_started (bool): flag to know whether the job has already started
- has_finished (bool): flag to know whether the job has already finished
- run_name (string): the name of the run
- wait_id (int): the ID of a run to wait for before starting
- start_time (TIMESTAMP): The start time for the run
Then you have to request the creation of the database.
You then have to launch the script. As mentioned, all BlackDynamite scripts inherit from the parsing system, so when launching one of them you can always ask for the valid keywords with '--help':
```bash
./createDB.py --help
usage: createDB.py [-h] [--job_constraints JOB_CONSTRAINTS] [--study STUDY]
                   [--port PORT] [--host HOST] [--user USER] [--truerun]
                   [--run_constraints RUN_CONSTRAINTS] [--yes] [--password]
                   [--list_parameters] [--BDconf BDCONF]
                   [--binary_operator BINARY_OPERATOR]

BlackDynamite option parser

optional arguments:
  -h, --help            show this help message and exit

General:
  --yes                 Answer all questions to yes. (default: False)
  --binary_operator BINARY_OPERATOR
                        Set the default binary operator to make requests to
                        database (default: and)

BDParser:
  --job_constraints JOB_CONSTRAINTS
                        This allows to constraint run selections by job
                        properties (default: None)
  --study STUDY         Specify the study from the BlackDynamite database.
                        This refers to the schemas in PostgreSQL language
                        (default: None)
  --port PORT           Specify data base server port (default: None)
  --host HOST           Specify data base server address (default: None)
  --user USER           Specify user name to connect to data base server
                        (default: None)
  --truerun             Set this flag if you want to truly perform the action
                        on base. If not set all action are mainly dryrun
                        (default: False)
  --run_constraints RUN_CONSTRAINTS
                        This allows to constraint run selections by run
                        properties (default: None)
  --password            Flag to request prompt for typing password (default:
                        False)
  --list_parameters     Request to list the possible job/run parameters
                        (default: False)
  --BDconf BDCONF       Path to a BlackDynamite file (*.bd) configuring
                        current optons (default: None)
```
An important point is that most actions are only applied when the 'truerun' flag is set. Also, you always have to specify the host and the study you are working on (all scripts can apply to several studies). To create the database, launch the script as follows:
```bash
./createDB.py --host lsmssrv1.epfl.ch --study MysuperCoolStudy --truerun
```
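The dry-run-by-default behaviour of the 'truerun' flag can be pictured with the standard argparse module. This is only a sketch of the pattern, not BlackDynamite's parser: the functions `build_parser` and `create_study` are hypothetical, and nothing is actually created.

```python
import argparse


def build_parser():
    """Build a tiny parser with a dry-run-by-default switch (illustrative)."""
    parser = argparse.ArgumentParser(description="dry-run sketch")
    parser.add_argument("--host", default=None)
    parser.add_argument("--study", default=None)
    # Without --truerun the script only reports what it would do.
    parser.add_argument("--truerun", action="store_true")
    return parser


def create_study(args):
    """Return a message describing the action; nothing is really created."""
    action = f"create study {args.study!r} on {args.host}"
    if args.truerun:
        return "performing: " + action
    return "dry run, would perform: " + action


args = build_parser().parse_args(["--host", "lsmssrv1.epfl.ch", "--study", "test"])
print(create_study(args))
```

Making the destructive path opt-in means that a forgotten flag results in a harmless report rather than an unwanted change to the database.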
The goal of the parametric study is to explore a subpart of the parametric space. We need to create jobs, which are the points to explore. The script that does so is usually named 'createJobs.py'.
We need to write a Python script to generate this set of jobs. We start by setting up the modules and the parser as in the 'createDB.py' script. Then we need to create a job object:
```python
job = BD.job.Job(base)
```
It is up to us to decide which values to explore. For convenience, it is possible to insert ranges of values:
```python
job["param1"] = 10
job["param2"] = [3.14, 1., 2.]
job["param3"] = 'toto'
```
This will create 3 jobs, since we provided a range of three values for the second parameter. The actual creation is made by calling:
The script is launched with:
```bash
./createJobs.py --host lsmssrv1.epfl.ch --study test --truerun
```
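To picture how a list-valued parameter turns one description into several jobs, here is a minimal stand-alone sketch. The helper `expand_jobs` is hypothetical, and it assumes that several list-valued parameters would combine as a Cartesian product, which is not stated above; BlackDynamite's own expansion may differ.

```python
from itertools import product


def expand_jobs(params):
    """Expand list-valued parameters into one dict per combination."""
    names = list(params)
    # Wrap scalar values in a one-element list so product() treats them alike.
    choices = [v if isinstance(v, list) else [v] for v in params.values()]
    return [dict(zip(names, combo)) for combo in product(*choices)]


# Same values as in the example above: only param2 is a range.
jobs = expand_jobs({"param1": 10, "param2": [3.14, 1.0, 2.0], "param3": "toto"})
for j in jobs:
    print(j)
```

With the values of the example, the sketch yields three job descriptions that differ only in 'param2'.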
At this point the jobs are in the database. You now need to create runs, which specify the conditions under which the jobs are carried out: for example the machine on which the job will run, path-dependent information, executable information and others. We have to write the last script, usually named 'createRuns.py', to specify how runs are created.
Again we start with the modules. This time, however, we can use another parser class better suited to the manipulation of runs:
```python
parser = BD.RunParser()
params = parser.parseBDParameters()
base = BD.Base(**params)
```
The default parameters for runs will then be automatically included in the parameters.
```python
myrun = BD.run.Run(base)
```
Some of the standard parameters might have been parsed directly by the RunParser, so we have to forward them to the Run object:
A run now specifies what action to perform to carry out the job. Usually, an end user has one or more scripts and wishes to attach them to the run. To attach a file you can, for instance, do:
Then, one has to specify which of these files is the entry point:
Finally, we have to create Run objects and attach them to jobs. The very first task is to claim the jobs from the database.
To that end the object JobSelector shall be your friend:
```python
jobSelector = BD.JobSelector(base)
job_list = jobSelector.selectJobs()
```
This will return a job list that you can loop through and attach the runs to:
```python
for j in job_list:
    myrun['compiler'] = 'gcc'
    myrun.attachToJob(j)
```
Everything should then be committed to the database:
```python
if params["truerun"] is True:
    base.commit()
```
Finally, to create the runs, launch the script by typing:
```bash
./createRuns.py --host lsmssrv1.epfl.ch --study test --machine_name lsmspc41 --run_name toto --nproc int --truerun
```
The runs are then launched using the tool 'launchRuns.py':
```bash
./launchRuns.py --host lsmssrv1.epfl.ch --study test --outpath /home/user/ --truerun (--nruns int)
```
The runs can be inspected in the database with the tool 'getRunInfo.py', and one can go to the run folder with 'enterRun.py':
```bash
./getRunInfo.py --host lsmssrv1.epfl.ch --study test
./enterRun.py --host lsmssrv1.epfl.ch --study test --run_id ID
```
The status of a run can be modified manually with the command 'cleanRuns.py'. The default status is CREATED (it can be switched to delete with the '--delete' option):
```bash
./cleanRuns.py --host lsmssrv1.epfl.ch --study test (--runid ID) --truerun (--delete)
```
The status and the other run parameters (e.g. the compiler in the example file) can also be modified with 'updateRuns.py'. This can be done from within the executed script to set the selected parameter automatically:
```bash
updateRuns.py --host lsmssrv1.epfl.ch --study test --updates 'state = toto' --truerun
```
The tool 'canYouDigIt.py' is an example of how to collect data from the runs to draw graphs. For example, to plot the crack length as a function of time for different values of sigma_c (the study parameter):
```bash
canYouDigIt.py --host lsmssrv1.epfl.ch --study test --quantity time, crack_length --using %0.x:%1.y --xlabel 'time' --ylabel 'crack_length' --legend 'sigma_c = %j.sigma_c'
```
Finally, the database can be saved in .zip format, to be exported and used offline, with 'saveBDStudy.py'.
Within your program you need a correctly initialized pusher in order to push data to the database. The file 'test_blackdynamite.cc' is an example of such a pusher.
First, the *blackdynamite* includes are required:
```cpp
#include "blackdynamite.hh"
```
Then you need to create a pusher object (a RunManager in this example) and initialize it:
```cpp
BlackDynamite::RunManager bd;
bd.startRun();
```
By default, the constructor reads environment variables to get the database connection and schema information:
- RUN_ID: the identifier of the run
- SCHEMA: the schema where the parametric study is to be stored
- HOST: the database hostname
Then, wherever values are produced, you push them to the database:
```cpp
bd.push(val1, "quantity1", step);
bd.push(val2, "quantity2", step);
```
Step is a stage identifier. It can be the step index within an explicit loop, an iteration index within a convergence descent, or whatever you wish. It will serve later to compare quantity entries.
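The role of the step as a common key can be sketched in a few lines of Python. Everything here is a hypothetical in-memory stand-in (the `store` dictionary and the `push` function are not the BlackDynamite API); the point is only that two quantities pushed with the same step values can later be paired entry by entry.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the database: quantity -> {step: value}.
store = defaultdict(dict)


def push(value, quantity, step):
    """Record one value of a quantity at a given step."""
    store[quantity][step] = value


for step in range(3):
    push(0.1 * step, "time", step)
    push(step * step, "crack_length", step)

# Pair the two quantities on the steps they have in common.
common = sorted(store["time"].keys() & store["crack_length"].keys())
pairs = [(store["time"][s], store["crack_length"][s]) for s in common]
print(pairs)
```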
Finally, when the job has ended, the following call informs the database that the run is finished:
Within your program you need a run object to push data to the database. It is obtained by selecting the run from its 'run_id' (usually passed as a parameter).
```python
parser = BD.RunParser()
params = parser.parseBDParameters()
# params['run_id'] should exist
mybase = BD.Base(**params)
runSelector = BD.RunSelector(mybase)
myrun = runSelector(params)
```
In order to record the start and finish times of the run, its 'start()' and 'finish()' methods need to be called.
```python
myrun.start()
# ...
# Important stuff
# ...
myrun.finish()
```
Pushing data can be done with 'pushVectorQuantity()' and 'pushScalarQuantity()'.
```python
myrun.pushVectorQuantity(vector_quantity, step, "quantity_id", is_integer=False)
myrun.pushScalarQuantity(scalar_quantity, step, "quantity_id", is_integer=False)
```
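To try this call sequence without a database, one can use a stand-in object. The class `FakeRun` below is purely hypothetical: it mimics the method names and argument order shown above and stores everything in memory, which says nothing about how BlackDynamite's Run is actually implemented.

```python
import time


class FakeRun:
    """In-memory stand-in mimicking start/push/finish; not BlackDynamite's Run."""

    def __init__(self):
        self.start_time = None
        self.finish_time = None
        self.scalars = []
        self.vectors = []

    def start(self):
        self.start_time = time.time()

    def finish(self):
        self.finish_time = time.time()

    def pushScalarQuantity(self, value, step, name, is_integer=False):
        cast = int if is_integer else float
        self.scalars.append((name, step, cast(value)))

    def pushVectorQuantity(self, values, step, name, is_integer=False):
        cast = int if is_integer else float
        self.vectors.append((name, step, [cast(v) for v in values]))


myrun = FakeRun()
myrun.start()
for step in range(3):
    # Push one scalar and one vector per step of the (dummy) simulation loop.
    myrun.pushScalarQuantity(step ** 2, step, "energy")
    myrun.pushVectorQuantity([step, step + 1], step, "positions", is_integer=True)
myrun.finish()
print(len(myrun.scalars), len(myrun.vectors))
```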
If you want to set up a PostgreSQL server to store BlackDynamite data, follow this procedure.
Install the PostgreSQL server:
```bash
sudo apt-get install postgresql-9.4
```
You now need privileges to create databases and users. This can be done using the following:
You should add a database named blackdynamite (only the first time):
```bash
psql --command "CREATE USER blackdynamite WITH PASSWORD '';"
createdb -O blackdynamite blackdynamite
```
You should create a user:
```bash
psql --command "CREATE USER mylogin WITH PASSWORD 'XXXXX';"
```
And grant that user permission to create tables:
```bash
psql --command "grant create on database blackdynamite to mylogin"
```
This can also be done with the convenience tool:
```bash
createUser.py --user admin_user --host hostname
```
How to list the available schemas?
```psql
> \dn
```
How to get into the schema or the study?
```psql
> set search_path to schema_name;
> set search_path to study_name;
```
How to list all the tables?
```psql
> \d
```
```psql
> SELECT * FROM pg_catalog.pg_tables;
```
How to list entries from a table (like the jobs)?
```psql
> SELECT * from table_name ;
```
How to list all the databases?
```psql
> \l
```
How to list the available databases?
```psql
> select datname from pg_catalog.pg_database;
```
How to know the current database?
```psql
> select current_database();
```