# Database {#database}

[TOC]

This page describes the specmicp::database module and how to set up a database.

# Introduction {#introduction}

In SpecMiCP, the term database denotes the thermodynamic database, which contains the information about the species and the chemical reactions. The database is encoded as a JSON file.

The database module is divided into two main parts: the data container (specmicp::database::DataContainer) and the database manager (specmicp::database::Database). The data container is shared by many objects through a shared pointer (specmicp::RawDatabasePtr). The database should be created and modified through the database manager.

# Creating and customizing the database {#managing}

## Parsing the database {#parsing}

The database can be created with the database manager :

~~~~~~~{.cpp}
specmicp::database::Database database_manager(path_to_the_database);
~~~~~~~

## Customization of the database {#customization}

The database can be adapted to the simulation through the methods defined by specmicp::database::Database. The actual implementations of these algorithms are provided by the different modules in the specmicp::database namespace. However, their direct use should be avoided and the methods provided by the \link specmicp::database::Database Database \endlink should be used instead. The only exception is specmicp::database::AqueousSelector, which should normally not be used.

### Swapping the components

The basis is defined in the JSON database but may be changed at the beginning of the simulation. Swapping the components may help the convergence of the simulation. It is done with the member function specmicp::database::Database::swap_components.

~~~~~~~~{.cpp}
database_manager.swap_components({
    {"H[+]", "HO[-]"},
    {"Si(OH)4", "SiO(OH)3[-]"}
});
~~~~~~~~

Due to the use of the complementarity condition, the database does not need to be changed during the simulation \cite Georget2015.

Note : liquid water "H2O" and the electron "E[-]" are fixed in the database and can NOT be swapped.

### Removing components

Not all components are needed in every computation. The basis can be reduced by using one of these four methods :

- specmicp::database::Database::keep_only_components(const std::vector&)
- specmicp::database::Database::keep_only_components(const std::vector&)
- specmicp::database::Database::remove_components(const std::vector&)
- specmicp::database::Database::remove_components(const std::vector&)

The liquid water ("H2O") and the electron ("E[-]") cannot be removed from the database.

The following code keeps 5 components in the database (including "H2O" and "E[-]") :

~~~~~~~~~~~{.cpp}
database_manager.keep_only_components({
    "HO[-]",
    "Ca[2+]",
    "SiO(OH)3[-]"
});
~~~~~~~~~~~

### Setting the list of solid phases

It is not always desirable to include all the solid phases in the computation. The list of solid phases to take into account in the speciation solver can be set with the specmicp::database::Database::minerals_keep_only() method.

~~~~~~~~~~{.cpp}
database_manager.minerals_keep_only({
    "Portlandite",
    "SiO2(am)",
    "CSHjennite"
});
~~~~~~~~~~

The function is overloaded and can be given a list of IDs instead (specmicp::database::Database::minerals_keep_only(const std::vector&)).

### Adding species

At the beginning of the simulation the user can add solid phases, gases or sorbed species to the database. The methods [add_solid_phases()](@ref specmicp::database::Database::add_solid_phases), [add_gas_phases()](@ref specmicp::database::Database::add_gas_phases), and [add_sorbed_species()](@ref specmicp::database::Database::add_sorbed_species) are used to add species to the database. Their argument is a JSON-formatted string. For a description of the format, see the format section below.

Example :

~~~~~~~{.cpp}
std::string gas_appendix_database = R"plop(
[
    {
        "label": "CO2(g)",
        "composition": "CO2(aq)",
        "log_k": -1.468
    }
]
)plop";

database_manager.add_gas_phases(gas_appendix_database);
~~~~~~~

## Obtaining the raw database {#container}

Once the database is prepared, a pointer to the raw database can be obtained with the specmicp::database::Database::get_database() method.

\code{.cpp}
specmicp::RawDatabasePtr raw_data = database_manager.get_database();
\endcode

## Freezing and validating the database {#validation}

The consistency of the database is a requirement for the computation. This consistency can be checked with the specmicp::database::DataContainer::is_valid() function. It checks that all of the species lists of the database are valid. It does not check the values, only that the modification algorithms ran as expected.

~~~~~~~{.cpp}
if (not raw_data->is_valid())
    throw std::runtime_error("Doomed, we are doomed my friends");
~~~~~~~

The check is not run automatically in non-debug mode, but it is recommended to use it at least once in a simulation.

To detect any undesired modification, the database can be frozen. To do so, call the method specmicp::database::DataContainer::freeze_db(). It does not prevent modifications to the database, but any call to [is_valid()](@ref specmicp::database::DataContainer::is_valid) will check that the database has not changed since the call to [freeze_db()](@ref specmicp::database::DataContainer::freeze_db). The definition of "no changes" is quite loose: again, the numerical values are not checked, only the names of the species in the database.

~~~~~~~{.cpp}
raw_data->freeze_db();
// code
if (not raw_data->is_valid())
    throw std::runtime_error("Doomed, we are doomed my friends, the database has changed.");
~~~~~~~

To avoid any undesired modification of the values, the getter methods defined in specmicp::database::DataContainer are read-only.
Modifying the values is therefore non-obvious and is considered to be deliberate.

## Saving the database {#saving}

The database can be saved in the JSON format for inspection or later use.

~~~~~~~{.cpp}
specmicp::database::Database database_manager = ... // obtain and transform the database
database_manager.save("file_where_db_is_saved.dat");
~~~~~~~

The saved database is both human- and machine-readable. It can be parsed by SpecMiCP to run further computations.

# Labels and IDs {#using}

Outside of the programs/solvers, the main use of the database is to make the link between the ID and the label of a species. The ID of a species is simply its position in a vector. The label of a species is its name in the database.

For efficiency reasons, the IDs of the species are used in the computation. However, labels improve readability and robustness and should be used in the setup and the post-processing of a simulation. The format of a label is described [here](#format_label).

Note 0 : The IDs are not globally unique. They are unique only within each type of species. For example, the first species of every list has ID "0".

Note 1 : The C convention is used: the first ID is "0". For example, water ("H2O") has ID "0" in the basis.

Note 2 : There are three sets of 'ID' that should not be confused :

- ID : this is the ID of a species in the database
- dof : this is the degree of freedom of a variable in a program to solve
- ideq : ID of an equation, this is the row for a variable/equation in the linear system to solve

The user should only worry about the first one. To implement a new equation and

Note 3 : The algorithms seem to be stable and the IDs are predictable. However, this is not a guarantee and the user should use the methods described in this section to find the ID of a species.

## label to ID {#ID}

There are two versions available, a 'safe' and an 'unsafe' version. The safe version (prefixed with safe_) throws std::runtime_error if the label does not exist.
The unsafe version returns specmicp::no_species if the species does not exist.

Safe versions :

- [safe_component_label_to_id()](@ref specmicp::database::Database::safe_component_label_to_id)
- [safe_aqueous_label_to_id()](@ref specmicp::database::Database::safe_aqueous_label_to_id)
- [safe_mineral_label_to_id()](@ref specmicp::database::Database::safe_mineral_label_to_id)
- [safe_mineral_kinetic_label_to_id()](@ref specmicp::database::Database::safe_mineral_kinetic_label_to_id)
- [safe_gas_label_to_id()](@ref specmicp::database::Database::safe_gas_label_to_id)
- [safe_sorbed_label_to_id()](@ref specmicp::database::Database::safe_sorbed_label_to_id)

Unsafe versions :

- [component_label_to_id()](@ref specmicp::database::Database::component_label_to_id)
- [aqueous_label_to_id()](@ref specmicp::database::Database::aqueous_label_to_id)
- [mineral_label_to_id()](@ref specmicp::database::Database::mineral_label_to_id)
- [mineral_kinetic_label_to_id()](@ref specmicp::database::Database::mineral_kinetic_label_to_id)
- [gas_label_to_id()](@ref specmicp::database::Database::gas_label_to_id)
- [sorbed_label_to_id()](@ref specmicp::database::Database::sorbed_label_to_id)

## ID to label {#label}

These methods return the label of a species given its ID and its type. There are no safe and unsafe versions: it is up to the user to ensure that the ID is valid. In debug mode, the bounds are checked using an assertion.

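The difference between the safe and unsafe lookups, and the assertion-only bounds check of the reverse lookup, can be sketched with a small stand-alone class. This is a minimal illustration and not the SpecMiCP implementation: the class `SpeciesList`, the `index_t` alias and the value chosen for `no_species` are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the SpecMiCP types (assumptions, not the library's definitions).
using index_t = std::ptrdiff_t;
constexpr index_t no_species = -1;

// One list of species of a given type; the ID of a species is its position in the vector.
class SpeciesList {
public:
    explicit SpeciesList(std::vector<std::string> labels)
        : m_labels(std::move(labels)) {}

    // "Unsafe" lookup: returns no_species when the label is unknown.
    index_t label_to_id(const std::string& label) const {
        for (std::size_t i = 0; i < m_labels.size(); ++i) {
            if (m_labels[i] == label) return static_cast<index_t>(i);
        }
        return no_species;
    }

    // "Safe" lookup: throws std::runtime_error when the label is unknown.
    index_t safe_label_to_id(const std::string& label) const {
        const index_t id = label_to_id(label);
        if (id == no_species) {
            throw std::runtime_error("Unknown species: " + label);
        }
        return id;
    }

    // ID to label: the bounds are only checked by an assertion (debug builds).
    const std::string& id_to_label(index_t id) const {
        assert(id >= 0 && static_cast<std::size_t>(id) < m_labels.size());
        return m_labels[static_cast<std::size_t>(id)];
    }

private:
    std::vector<std::string> m_labels;
};
```

With the unsafe version the caller has to compare the result against `no_species` before using it as an index, whereas the safe version reports the error at the point of the lookup, which is usually preferable in setup and post-processing code.
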
Available methods :

- [component_id_to_label()](@ref specmicp::database::Database::component_id_to_label)
- [aqueous_id_to_label()](@ref specmicp::database::Database::aqueous_id_to_label)
- [mineral_id_to_label()](@ref specmicp::database::Database::mineral_id_to_label)
- [mineral_kinetic_id_to_label()](@ref specmicp::database::Database::mineral_kinetic_id_to_label)
- [gas_id_to_label()](@ref specmicp::database::Database::gas_id_to_label)
- [sorbed_id_to_label()](@ref specmicp::database::Database::sorbed_id_to_label)

## low-level {#label_low}

In the computation, it is not necessary to create a [database manager](@ref specmicp::database::Database) to obtain the labels and IDs. The user may use the methods defined by specmicp::database::DataContainer. The following methods are available :

### label to ID

- [get_id_component(const std::string& label)](@ref specmicp::database::DataContainer::get_id_component)
- [get_id_aqueous(const std::string& label)](@ref specmicp::database::DataContainer::get_id_aqueous)
- [get_id_mineral(const std::string& label)](@ref specmicp::database::DataContainer::get_id_mineral)
- [get_id_mineral_kinetic(const std::string& label)](@ref specmicp::database::DataContainer::get_id_mineral_kinetic)
- [get_id_gas(const std::string& label)](@ref specmicp::database::DataContainer::get_id_gas)
- [get_id_sorbed(const std::string& label)](@ref specmicp::database::DataContainer::get_id_sorbed)

### ID to label

- [get_label_component(index_t id)](@ref specmicp::database::DataContainer::get_label_component)
- [get_label_aqueous(index_t id)](@ref specmicp::database::DataContainer::get_label_aqueous)
- [get_label_mineral(index_t id)](@ref specmicp::database::DataContainer::get_label_mineral)
- [get_label_mineral_kinetic(index_t id)](@ref specmicp::database::DataContainer::get_label_mineral_kinetic)
- [get_label_gas(index_t id)](@ref specmicp::database::DataContainer::get_label_gas)
- [get_label_sorbed(index_t id)](@ref specmicp::database::DataContainer::get_label_sorbed)

### Size

The number of species in a list can be obtained through the following functions :

- [nb_component()](@ref specmicp::database::DataContainer::nb_component)
- [nb_aqueous()](@ref specmicp::database::DataContainer::nb_aqueous)
- [nb_mineral()](@ref specmicp::database::DataContainer::nb_mineral)
- [nb_mineral_kinetic()](@ref specmicp::database::DataContainer::nb_mineral_kinetic)
- [nb_gas()](@ref specmicp::database::DataContainer::nb_gas)
- [nb_sorbed()](@ref specmicp::database::DataContainer::nb_sorbed)

The maximum index is the number of species - 1.

### Range

To iterate over a list of species the following methods are defined :

- [range_component()](@ref specmicp::database::DataContainer::range_component)
- [range_aqueous_component()](@ref specmicp::database::DataContainer::range_aqueous_component)
- [range_aqueous()](@ref specmicp::database::DataContainer::range_aqueous)
- [range_mineral()](@ref specmicp::database::DataContainer::range_mineral)
- [range_mineral_kinetic()](@ref specmicp::database::DataContainer::range_mineral_kinetic)
- [range_gas()](@ref specmicp::database::DataContainer::range_gas)
- [range_sorbed()](@ref specmicp::database::DataContainer::range_sorbed)

The [range_aqueous_component()](@ref specmicp::database::DataContainer::range_aqueous_component) method does not include the water ("H2O") and the electron ("E[-]").

The following example prints the labels of the aqueous species :

~~~~~~~~{.cpp}
std::cout << "List of Aqueous species : \n";
for (auto id: raw_data->range_aqueous())
    std::cout << " - " << raw_data->get_label_aqueous(id) << "\n";
std::cout << std::endl;
~~~~~~~~

# Thermodynamics data {#thermo}

The database contains the stoichiometric coefficients, the equilibrium constants, the non-ideal aqueous solution parameters, the molar volumes, etc. In short, all the thermodynamic data needed for the computations.
The complete description of the available data can be found [here](@ref specmicp::database::DataContainer).

Note : the thermodynamic data is only available through the [DataContainer](@ref specmicp::database::DataContainer) class, not the [Database](@ref specmicp::database::Database) class, which is reserved for the [customization algorithms](#managing).

# Format of a database {#jsonformat}

The database is saved as a JSON file. JSON is a concise format to exchange data. This section describes the format of the database.

The database is a list of sections :

~~~~~{.js}
{
    "Metadata": {...},
    "Basis": [...],
    "Aqueous": [...],
    "Minerals": [...],
    ("Gas": [...],)
    ("Sorbed": [...],)
    "References": [...]
}
~~~~~

where parentheses `( ... )` denote an optional section.

The JSON format is extensible and extra information can be included in the database. However, it will be ignored by SpecMiCP.

The format of the database was chosen to avoid any ambiguity and to obtain a database that is easy to parse, in contrast to the formats used by other tools. The objective is that other tools are able to implement a parser for the same format.

## Labels {#format_label}

A label is formatted as : `<name>[<charge>]`. The charge consists of a number and a sign. If the number is 1 it can be omitted. If the charge is 0, the brackets can be omitted.

For example, the following labels are valid :

- `H2O`
- `E[-]`
- `HO[-]`
- `Ca[2+]`
- `Al(OH)4`
- `AlO(OH)3[-]`
- `AlO2(OH)2[2-]`

## Chemical reactions {#format_equations}

A chemical reaction in the database always denotes the dissociation of the species (decomplexation, dissolution, ...). For this reason, it is called a composition in the JSON file.

A composition is simply a comma-separated list of species, each preceded by a stoichiometric coefficient. A stoichiometric coefficient is a sign (`+`/`-`) followed by a number. If the sign is `+` it can be omitted. If the number is `1` it can be omitted.

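The label and composition formats described above are simple enough to be parsed in a few lines. The following is a minimal sketch of such a parser together with a charge-balance check, written for illustration only: the names `Term`, `label_charge`, `parse_composition` and `is_charge_balanced` are hypothetical and are not part of the SpecMiCP API. It assumes well-formed input and labels that do not start with a digit.

```cpp
#include <cassert>
#include <cctype>
#include <cmath>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Charge encoded in a label such as "Ca[2+]" or "HO[-]"; 0 when there are no brackets.
int label_charge(const std::string& label) {
    const std::size_t open = label.rfind('[');
    if (label.empty() || open == std::string::npos || label.back() != ']') return 0;
    // Text between the brackets, e.g. "2+" or "-".
    const std::string inside = label.substr(open + 1, label.size() - open - 2);
    assert(!inside.empty());
    const char sign = inside.back();
    assert(sign == '+' || sign == '-');
    const std::string digits = inside.substr(0, inside.size() - 1);
    const int magnitude = digits.empty() ? 1 : std::stoi(digits);  // omitted number means 1
    return sign == '+' ? magnitude : -magnitude;
}

// One entry of a composition: a stoichiometric coefficient and a species label.
struct Term {
    double coefficient;
    std::string label;
};

// Split a composition such as "Al[3+], 4 H2O, -4 H[+]" into its terms.
std::vector<Term> parse_composition(const std::string& composition) {
    std::vector<Term> terms;
    std::istringstream stream(composition);
    std::string item;
    while (std::getline(stream, item, ',')) {
        std::size_t pos = 0;
        auto is_space = [&](std::size_t i) {
            return std::isspace(static_cast<unsigned char>(item[i])) != 0;
        };
        while (pos < item.size() && is_space(pos)) ++pos;
        // Optional sign; "+" is the default.
        double sign = 1.0;
        if (pos < item.size() && (item[pos] == '+' || item[pos] == '-')) {
            if (item[pos] == '-') sign = -1.0;
            ++pos;
        }
        while (pos < item.size() && is_space(pos)) ++pos;
        // Optional number; 1 is the default.
        const std::size_t start = pos;
        while (pos < item.size() &&
               (std::isdigit(static_cast<unsigned char>(item[pos])) || item[pos] == '.')) ++pos;
        const double number = (pos > start) ? std::stod(item.substr(start, pos - start)) : 1.0;
        while (pos < item.size() && is_space(pos)) ++pos;
        // The rest of the item, without trailing spaces, is the label.
        std::size_t end = item.size();
        while (end > pos && is_space(end - 1)) --end;
        terms.push_back({sign * number, item.substr(pos, end - pos)});
    }
    return terms;
}

// True if the charge of the species equals the summed charge of its composition.
bool is_charge_balanced(const std::string& species, const std::string& composition) {
    double total = 0.0;
    for (const Term& term : parse_composition(composition)) {
        total += term.coefficient * label_charge(term.label);
    }
    return std::abs(total - label_charge(species)) < 1e-9;
}
```

For instance, `is_charge_balanced("Al(OH)4[-]", "Al[3+], 4 H2O, -4 H[+]")` holds because 3 + 4 × 0 − 4 × 1 = −1, which is the charge carried by the label of the species.
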
Examples :

- `Al(OH)4[-]` : `Al[3+], 4 H2O, -4 H[+]`
- `S2O3[2-]` : `2.00 SO4[2-], 10.00 H[+], 8.00 E[-], -5.00 H2O`

The charge neutrality of the equation is tested during the parsing of the database.

## Metadata section {#format_metadata}

This section describes meta-information such as the name and the version of the database.

~~~~~~~~{.js}
"Metadata": {
    "name": <name>,
    "version": <version>
},
~~~~~~~~

JSON is an extensible format and the user may add a changelog entry to describe the modifications made in each version.

## Basis section {#format_basis}

This section defines the basis used to build the database. It is a list of entries formatted as :

~~~~~~~~{.js}
{
    "label":