pychemia.db package

One core capability of PyChemia is global optimization of atomic structures such as clusters and crystals. Global optimization requires computing forces and energies for hundreds or even thousands of structures, and all these data must be stored and processed efficiently.

PyChemia relies on MongoDB to store structures and the properties computed by atomistic codes. This module creates and manipulates Mongo databases. Two kinds of databases are defined in PyChemia: __PyChemiaDB__ stores structures and their properties, and __PyChemiaQueue__ is a repository of calculations.

When used by a global searcher, a PyChemiaDB contains several collections, such as:

  • pychemia_entries: The main collection; contains all the structures, properties and status info
  • fingeprints: Collection using the same IDs as pychemia_entries, storing the fingerprints of the corresponding
    structures
  • generations: Collection using the same IDs as pychemia_entries, storing the generation to which each structure belongs
  • generation_changes: Collection storing the changes introduced from one generation to the next one.
  • population_info: Stores the basic parameters entered to create the population
  • searcher_info: Stores the parameters of the searcher used to populate the database
  • lineage: Stores the heritage of structures produced by the global searcher
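
Because the auxiliary collections reuse the IDs of pychemia_entries, everything known about one structure can be gathered by looking up the same identifier in each collection. The sketch below mimics that layout with plain Python dictionaries instead of a live MongoDB server. The ID strings, the toy values and the field names inside the auxiliary documents ("fingerprint", "generation") are hypothetical and chosen only for illustration; they are not PyChemia's actual schema.

```python
# Plain dictionaries standing in for MongoDB collections, keyed by entry ID.
# All IDs, values and auxiliary field names are hypothetical placeholders.
entry_docs = {
    "id-001": {"structure": {"symbols": ["Mg", "O"]},
               "properties": {"energy": -1.0},
               "status": {"relaxed": True}},
    "id-002": {"structure": {"symbols": ["Ti", "O", "O"]},
               "properties": {},
               "status": {"relaxed": False}},
}
fingerprint_docs = {"id-001": {"fingerprint": [0.1, 0.2, 0.3]}}
generation_docs = {"id-001": {"generation": 0}, "id-002": {"generation": 1}}


def gather(entry_id):
    """Collect the documents that share one entry ID across collections."""
    return {
        "entry": entry_docs[entry_id],
        # Auxiliary documents may be missing, so use .get() instead of [].
        "fingerprint": fingerprint_docs.get(entry_id),
        "generation": generation_docs.get(entry_id),
    }


record = gather("id-002")
```

Here "id-002" has a generation document but no fingerprint yet, so the gathered record carries None for the fingerprint.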

To run calculations more efficiently across several computer clusters and dedicated machines, PyChemia provides a central database of calculations that can be used to feed clusters with execution jobs. The PyChemiaQueue fills the role of a meta-queue for structures and jobs that need to be computed.
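
The meta-queue idea can be illustrated without MongoDB: pending jobs sit in one central place, and each cluster asks for the next job to run. The class below is a minimal in-memory sketch, not PyChemiaQueue itself. The name MetaQueueSketch is hypothetical, and the rule that a larger priority is served first is an assumption; the source only shows that new_entry accepts a priority argument.

```python
import heapq
import itertools


class MetaQueueSketch:
    """In-memory stand-in for a central queue of pending calculations."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def add(self, job, priority=0):
        # heapq is a min-heap, so the priority is negated to pop the
        # largest priority first (assumed ordering, see the text above).
        heapq.heappush(self._heap, (-priority, next(self._counter), job))

    def next_job(self):
        """Return the highest-priority pending job, or None when empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


pending = MetaQueueSketch()
pending.add("relax structure A", priority=0)
pending.add("relax structure B", priority=5)
pending.add("relax structure C", priority=5)

# Any cluster can call next_job(); four requests drain the three jobs.
served = [pending.next_job() for _ in range(4)]
```

Jobs with equal priority are handed out in the order they were added, and a request against an empty queue returns None instead of raising.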

Submodules

pychemia.db.db module

class pychemia.db.db.PyChemiaDB(name='pychemiadb', host='localhost', port=27017, user=None, passwd=None, ssl=False, replicaset=None)[source]

Bases: object

clean()[source]
create_static(field)[source]
find_AnBm(specie_a=None, specie_b=None, n=1, m=1)[source]

Search for structures with a composition expressed as AnBm, where exactly one of A or B is fixed and the amounts n and m are both fixed

Parameters:
  • specie_a – (str) atom symbol for the first species
  • specie_b – (str) atom symbol for the second species
  • n – number of atoms of species ‘a’
  • m – number of atoms of species ‘b’
Returns:

(list) List of ids for all the structures that fulfill the conditions
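
The matching rule described for find_AnBm can be sketched on plain composition dictionaries. This is an illustration of the rule as worded above, not the method's actual implementation: the function name matches_anbm is hypothetical, and treating n and m as exact atom counts (rather than a ratio) is an assumption.

```python
def matches_anbm(composition, specie_a=None, specie_b=None, n=1, m=1):
    """Check a {symbol: count} dictionary against the AnBm pattern."""
    # Exactly one of the two species must be fixed by the caller.
    if (specie_a is None) == (specie_b is None):
        raise ValueError("Fix exactly one of specie_a or specie_b")
    # AnBm describes binary compositions only.
    if len(composition) != 2:
        return False
    if specie_a is not None:
        fixed, fixed_count, free_count = specie_a, n, m
    else:
        fixed, fixed_count, free_count = specie_b, m, n
    if composition.get(fixed) != fixed_count:
        return False
    # The remaining species is the wildcard; only its count is constrained.
    (other_count,) = [c for s, c in composition.items() if s != fixed]
    return other_count == free_count


# Any oxide of the form A1O2: B is fixed to oxygen, A is left free.
is_dioxide = matches_anbm({"Ti": 1, "O": 2}, specie_b="O", n=1, m=2)
```

Filtering a list of compositions with this predicate gives the same kind of result the method documents: the subset of entries that fulfill the conditions.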

find_composition(composition)[source]

Search for structures with a pseudo-composition expressed as a dictionary, where symbols that are not atomic symbols, such as A or X, can be used to represent arbitrary atoms

Returns:(list) List of ids for all the structures that fulfill the conditions
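
A pseudo-composition mixes real atomic symbols with placeholders. The sketch below shows one way such a pattern could be compared with a concrete composition; it is an assumption about the semantics, not the code behind find_composition. The function name and the known_symbols argument are hypothetical: the caller passes the set of symbols to be treated as real elements, and every other key in the pattern acts as a placeholder.

```python
from collections import Counter


def matches_pseudo_composition(composition, pattern, known_symbols):
    """Compare a {symbol: count} dictionary with a pseudo-composition."""
    fixed = {s: c for s, c in pattern.items() if s in known_symbols}
    wildcard_counts = [c for s, c in pattern.items() if s not in known_symbols]
    # Every real symbol in the pattern must appear with exactly that count.
    for symbol, count in fixed.items():
        if composition.get(symbol) != count:
            return False
    # The species left over must pair up with the placeholders by count.
    remaining = [c for s, c in composition.items() if s not in fixed]
    return Counter(remaining) == Counter(wildcard_counts)


# A small illustrative set; a real check would use the full periodic table.
symbols = {"O", "Ca", "Ti", "Mg", "Sr"}
perovskite_like = matches_pseudo_composition(
    {"Ca": 1, "Ti": 1, "O": 3}, {"A": 1, "X": 1, "O": 3}, symbols)
```

Because the keys of a composition dictionary are distinct, two placeholders always map to two different species.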
get_dicts(entry_id)[source]

Return a tuple with the fields of an entry: structure, properties and status

:rtype : tuple

get_entry(entry_id)[source]
get_structure(entry_id)[source]

Return the structure in the entry with id ‘entry_id’

:rtype : Structure

get_tags()[source]
insert(structure, properties=None, status=None)[source]

Insert a pychemia structure instance and properties into the database

Parameters:
  • structure – (pychemia.Structure) An instance of Pychemia’s Structure
  • properties – (dict) Dictionary of properties
  • status – (dict) Dictionary of status

is_locked(entry_id)[source]

Return whether a given entry is locked by someone evaluating the structure it contains

:rtype : bool

lock(entry_id, name=None)[source]
map_to_all(function, nparal=6)[source]
replace_failed()[source]
save_json(filename='db_settings.json')[source]
set_minimal_schema()[source]
unlock(entry_id, name=None)[source]
update(entry_id, structure=None, properties=None, status=None)[source]

Update the fields ‘structure’, ‘properties’ or ‘status’ for a given identifier ‘entry_id’

Parameters:
  • entry_id – (ObjectID, str)
  • structure – (pychemia.Structure) Structure to update
  • properties – (dict) Dictionary of properties to update
  • status – (dict) Status dictionary
Returns:

The identifier for the entry that was updated

:rtype : ObjectId
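
The locking and update methods listed above suggest a simple worker pattern: skip entries that are already locked, lock the entry, evaluate its structure, store the result and release the lock. The sketch below is one possible arrangement, not a workflow prescribed by PyChemia. It calls only is_locked, lock, get_structure, update and unlock with the signatures shown in this section. The helper evaluate_entry and the InMemoryDB class are hypothetical; InMemoryDB exists only so the example runs without a MongoDB server.

```python
def evaluate_entry(pcdb, entry_id, evaluator, worker_name=None):
    """Lock an entry, evaluate its structure, store the result, unlock."""
    if pcdb.is_locked(entry_id):
        return None  # someone else is already evaluating this structure
    pcdb.lock(entry_id, name=worker_name)
    try:
        structure = pcdb.get_structure(entry_id)
        properties = evaluator(structure)
        return pcdb.update(entry_id, properties=properties)
    finally:
        # Release the lock even if the evaluation fails.
        pcdb.unlock(entry_id, name=worker_name)


class InMemoryDB:
    """Hypothetical stand-in exposing the PyChemiaDB methods used above."""

    def __init__(self):
        self.entries = {}
        self.locks = {}

    def is_locked(self, entry_id):
        return entry_id in self.locks

    def lock(self, entry_id, name=None):
        self.locks[entry_id] = name

    def unlock(self, entry_id, name=None):
        self.locks.pop(entry_id, None)

    def get_structure(self, entry_id):
        return self.entries[entry_id]["structure"]

    def update(self, entry_id, structure=None, properties=None, status=None):
        entry = self.entries[entry_id]
        if structure is not None:
            entry["structure"] = structure
        if properties is not None:
            entry["properties"] = properties
        if status is not None:
            entry["status"] = status
        return entry_id


db = InMemoryDB()
db.entries["e1"] = {"structure": ["H", "H"], "properties": {}, "status": {}}
updated_id = evaluate_entry(db, "e1", lambda s: {"natom": len(s)},
                            worker_name="worker-1")
```

With a real PyChemiaDB instance passed as pcdb, the evaluator would typically wrap a call to an atomistic code and return its computed properties as a dictionary.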

pychemia.db.db.create_database(name, admin_name, admin_passwd, user_name, user_passwd, host='localhost', port=27017, ssl=False, replicaset=None)[source]
pychemia.db.db.create_user(name, admin_name, admin_passwd, user_name, user_passwd, host='localhost', port=27017, ssl=False, replicaset=None)[source]

Creates a new user for the database ‘name’

Parameters:
  • name – (str) The name of the database
  • admin_name – (str) The administrator name
  • admin_passwd – (str) Administrator password
  • user_name – (str) Username for the database
  • user_passwd – (str) Password for the user
  • host – (str) Name of the host for the MongoDB server (default: ‘localhost’)
  • port – (int) Port to connect to the MongoDB server (default: 27017)
  • ssl – (bool) If True enable ssl encryption for communications to host (default: False)
  • replicaset – (str, None) Identifier of a Replica Set
pychemia.db.db.get_database(db_settings)[source]
pychemia.db.db.has_connection()[source]
pychemia.db.db.object_id(entry_id)[source]
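
The db_settings argument of get_database and the file written by save_json are not described in this section. A plausible reading, stated here as an assumption, is that the settings are a dictionary whose keys mirror the keyword arguments of the PyChemiaDB constructor. The sketch below builds such a dictionary and round-trips it through JSON text; the helper name make_db_settings is hypothetical and the key names should be checked against the library before use.

```python
import json


def make_db_settings(name="pychemiadb", host="localhost", port=27017,
                     user=None, passwd=None, ssl=False, replicaset=None):
    """Build a settings dictionary mirroring the constructor keywords."""
    # Assumed layout: one key per PyChemiaDB keyword argument.
    return {"name": name, "host": host, "port": port, "user": user,
            "passwd": passwd, "ssl": ssl, "replicaset": replicaset}


settings = make_db_settings(name="my_search", host="localhost")

# JSON keeps strings, integers, booleans and None (as null) intact,
# so the dictionary survives a save/load cycle unchanged.
text = json.dumps(settings, indent=2, sort_keys=True)
restored = json.loads(text)
```

Storing a password in a plain JSON file exposes it to anyone who can read the file, so restrictive file permissions are advisable.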

pychemia.db.queue module

class pychemia.db.queue.PyChemiaQueue(name='Queue', host='localhost', port=27017, user=None, passwd=None, ssl=False, replicaset=None)[source]

Bases: object

add_file(entry_id, location, filepath)[source]
add_input_file(entry_id, filename)[source]
get_input_structure(entry_id)[source]
get_input_variables(entry_id)[source]
get_output_structure(entry_id)[source]
get_structure(entry_id, location)[source]

Return the structure in the entry with id ‘entry_id’

:rtype : Structure

new_entry(structure=None, variables=None, code=None, files=None, priority=0, dbname=None, db_id=None)[source]
set_input(entry_id, code, inputvar)[source]
set_input_structure(entry_id, structure)[source]
set_job_settings(entry_id, nparal=None, queue=None, nhours=None, mail=None, task_name=None, task_settings=None, task_kind=None)[source]
set_minimal_schema()[source]
set_output_structure(entry_id, structure)[source]
set_structure(entry_id, location, structure)[source]
write_input_files(entry_id, destination=None)[source]
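
A typical submission could chain three of the methods listed above: create the entry, attach job settings and write the input files to a working directory. The sketch below uses only new_entry, set_job_settings and write_input_files with the keyword arguments shown in this section. That new_entry returns the identifier of the new entry is an assumption, as are the helper submit_calculation, the RecordingQueue stand-in and the sample values for code, variables and the destination path.

```python
def submit_calculation(queue, structure, variables, code, workdir,
                       priority=0, nparal=None, nhours=None):
    """Create a queue entry, attach job settings, write its input files."""
    # Assumption: new_entry returns the identifier of the created entry.
    entry_id = queue.new_entry(structure=structure, variables=variables,
                               code=code, priority=priority)
    queue.set_job_settings(entry_id, nparal=nparal, nhours=nhours)
    queue.write_input_files(entry_id, destination=workdir)
    return entry_id


class RecordingQueue:
    """Hypothetical stand-in that records calls instead of using MongoDB."""

    def __init__(self):
        self.calls = []

    def new_entry(self, structure=None, variables=None, code=None,
                  files=None, priority=0, dbname=None, db_id=None):
        self.calls.append(("new_entry", code, priority))
        return "entry-1"

    def set_job_settings(self, entry_id, nparal=None, queue=None,
                         nhours=None, mail=None, task_name=None,
                         task_settings=None, task_kind=None):
        self.calls.append(("set_job_settings", entry_id, nparal, nhours))

    def write_input_files(self, entry_id, destination=None):
        self.calls.append(("write_input_files", entry_id, destination))


fake_queue = RecordingQueue()
new_id = submit_calculation(fake_queue, structure=["Mg", "O"],
                            variables={"example_variable": 1}, code="vasp",
                            workdir="/tmp/job-1", priority=2,
                            nparal=4, nhours=12)
```

Passing a real PyChemiaQueue instance in place of RecordingQueue would send the same three calls to the central database.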

pychemia.db.repo module

There are two kinds of repositories in PyChemia:

  • Structure Repositories, where many structures are stored
  • Execution Repositories, where the output of every calculation is stored

Each structure contains some metadata that is accessible through the StructureEntry object. Each calculation also has its own metadata, accessible through the ExecutionEntry object.

class pychemia.db.repo.ExecutionRepository[source]

Bases: object

Defines the location and properties of the Repository where all the executions will be stored

class pychemia.db.repo.PropertiesEntry(structure_entry)[source]

Bases: object

Defines one calculation in the Execution Repository

add_property(name, values)[source]
load()[source]

Loads an existing db from its configuration file

save()[source]

Save an existing repository information

class pychemia.db.repo.StructureEntry(structure=None, repository=None, identifier=None, original_file=None, tags=None)[source]

Bases: object

Defines one entry in the repository of Structures

add_children(children)[source]
add_original_file(filep)[source]
add_parents(parents)[source]
add_tags(tags)[source]
load()[source]
load_originals()[source]
metadatafromdict(entrydict)[source]
metadatatodict()[source]
save()[source]
class pychemia.db.repo.StructureRepository(path)[source]

Bases: object

Defines the location of the executions repository and the structure repository, and methods to add, remove and check those repositories

add_entry(entry)[source]

Add a new StructureEntry into the repository

add_many_entries(list_of_entries, tag, number_threads=1)[source]
clean()[source]
del_entry(entry)[source]
fromdict(repos_dict)[source]
get_all_entries
get_formulas()[source]
load()[source]

Loads an existing db from its configuration file

merge(other)[source]

Add all the contents from other db into the calling object

Parameters:other – StructureRepository
merge2entries(orig, dest)[source]
rebuild()[source]
refine()[source]
save()[source]

Save an existing repository information

structure_entry(ident)[source]
todict()[source]

Serialize the values of the db into a dictionary