Molecule
A Molecule defines a collection of atoms by their geometry (in Bohr), atomic numbers, charge, and multiplicity. It can be created from a variety of sources and data formats.
The Molecule building block is a critical input (and often output) of most workflows.
Creating a new Molecule
A common method of Molecule creation is to directly set the fields:
import sierra
from sierra.inputs import *
# Build a molecule from raw data, note the distances are in Bohr
he2 = Molecule(atomic_numbers=[2, 2], geometry=[0, 0, 0, 0, 0, 5])
print(he2)
#> Molecule(formula='He2', eoi='c8de2a8')
print(he2.measure([0, 1]))
#> 5.0
Here, the charge takes its default of 0. The multiplicity is left unset, so it resolves to 1, the lowest multiplicity consistent with the even electron count of He2.
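The parity rule behind the default multiplicity (described under the multiplicity field) can be illustrated without Sierra. The sketch below is not Sierra's implementation; the helper name `lowest_multiplicity` is hypothetical, and it assumes the lowest multiplicity is 1 for an even electron count and 2 for an odd one.

```python
def lowest_multiplicity(atomic_numbers, charge=0):
    """Return the lowest spin multiplicity allowed by electron-count parity.

    Hypothetical helper for illustration only, not part of Sierra.
    """
    n_electrons = sum(atomic_numbers) - charge
    if n_electrons < 0:
        raise ValueError("charge exceeds the total nuclear charge")
    # Even electron count -> singlet (1); odd electron count -> doublet (2).
    return 1 if n_electrons % 2 == 0 else 2


print(lowest_multiplicity([2, 2]))             # He2, 4 electrons
print(lowest_multiplicity([8, 1], charge=-1))  # OH-, 10 electrons
print(lowest_multiplicity([8, 1]))             # OH radical, 9 electrons
```

For He2 this gives 1, which matches the value used in the example above.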
Element symbols can also be used in place of atomic numbers for initialization:
from sierra.inputs import *
# Build a molecule from symbols, note the distances are in Bohr
he2 = Molecule(symbols=["He", "He"], geometry=[0, 0, 0, 0, 0, 5])
print(he2)
#> Molecule(formula='He2', eoi='c8de2a8')
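The examples above pass the geometry as a flat list, while the geometry field is documented as an (n, 3) array. The following standalone sketch shows how a flat list maps onto per-atom coordinates and how a two-atom distance follows from them. It assumes that `measure([0, 1])` reports the distance between atoms 0 and 1, which is consistent with the 5.0 printed earlier; the helper names `to_points` and `distance` are hypothetical and not part of Sierra.

```python
import math


def to_points(flat_geometry):
    """Group a flat [x0, y0, z0, x1, ...] list into (x, y, z) tuples."""
    if len(flat_geometry) % 3 != 0:
        raise ValueError("geometry length must be a multiple of 3")
    return [
        tuple(flat_geometry[i:i + 3])
        for i in range(0, len(flat_geometry), 3)
    ]


def distance(flat_geometry, i, j):
    """Euclidean distance between atoms i and j, in the input units."""
    points = to_points(flat_geometry)
    return math.dist(points[i], points[j])


he2_geometry = [0, 0, 0, 0, 0, 5]  # Bohr, as in the He2 example
print(to_points(he2_geometry))
print(distance(he2_geometry, 0, 1))
```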
Importing common file formats
It is also common to construct a Molecule from SDF, XYZ or XYZ+ text. These formats specify positions in Angstrom; Molecule converts them to Bohr before storing them in the geometry field.
A compatible file can be loaded with the file field:
from sierra.inputs import *
# Build a molecule from a SDF, XYZ or XYZ+ file
water = Molecule(file="examples/atoms.xyz")
Alternatively, the file content can be passed to the data field:
from sierra.inputs import *
# Build a molecule from SDF, XYZ or XYZ+ contents
# Note the distances are in Angstrom
water = Molecule(
data="""
O 0 0 0
H 0 0 1
H 0 1 0
"""
)
print(water)
#> Molecule(formula='H2O', eoi='b7a2e71')
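To make the unit conversion concrete, the sketch below converts the water coordinates from the example above from Angstrom to Bohr by hand. It uses the CODATA 2018 Bohr radius (0.529177210903 Angstrom); the exact constant Sierra uses is an assumption here, and the helper name `angstrom_to_bohr` is hypothetical.

```python
# CODATA 2018 value of the Bohr radius expressed in Angstrom.
ANGSTROM_PER_BOHR = 0.529177210903
BOHR_PER_ANGSTROM = 1.0 / ANGSTROM_PER_BOHR


def angstrom_to_bohr(coords):
    """Convert a list of [x, y, z] positions from Angstrom to Bohr."""
    return [[value * BOHR_PER_ANGSTROM for value in atom] for atom in coords]


# Same positions as the water example, in Angstrom.
water_angstrom = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
water_bohr = angstrom_to_bohr(water_angstrom)

# An O-H separation of 1 Angstrom is roughly 1.8897 Bohr.
print(round(water_bohr[1][2], 4))
```

Values read back from the geometry field of a Molecule built from such a file would therefore be about 1.89 times larger than the numbers in the file.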
Generating from a SMILES string
A Molecule can be generated directly from a SMILES string:
from sierra.inputs import *
butane = Molecule(smiles="CCCC")
print(butane)
#> Molecule(formula='C4H10', eoi='5c33557')
Here, our internal conformer tools generate a structure from the SMILES string. Note that this implementation prioritizes execution speed and a reasonable structure over a rigorous conformational search. Use the Conformer workflow for full control over geometry generation.
Importing from PubChem
A convenient way to create a Molecule is through the PubChem interface. The pubchem attribute searches PubChem for the best common-name match and generates a Molecule from the result.
from sierra.inputs import *
caffeine = Molecule(pubchem="caffeine")
print(caffeine)
#> Molecule(formula='C8H10N4O2', eoi='328969a')
Warning
The pubchem interface sends data to PubChem servers and should not be used for proprietary material. This is the only operation in Sierra that contacts an outside server; all other calls, including the Conformer workflow, run locally.
Exporting a Molecule
Molecule objects can be exported to a file or returned as a string in XYZ+ format:
from pathlib import Path
from sierra.inputs import *
mol = Molecule(pubchem="caffeine")
# Write a molecule as XYZ+ format
xyz_text = mol.write()
print(xyz_text)
"""
24
0 1
O 0.470000000755 2.568800000562 0.000600002289
O -3.127099997715 -0.443600000591 -0.000300001144
N -0.968599997718 -1.312500000756 0.000000000000
N 2.218200000188 0.141200000637 -0.000300001144
N -1.347700000975 1.079700001715 -0.000099998618
N 1.411900002456 -1.937200000728 0.000200002527
C 0.857899998774 0.259199999205 -0.000799999524
C 0.389699999594 -1.026399997716 -0.000399999762
C 0.030699998928 1.422000000414 -0.000600002289
C -1.906100002038 -0.249500001009 -0.000399999762
C 2.503200002560 -1.199799997711 0.000300001144
C -1.427600002373 -2.695999998946 0.000799999524
C 3.192600002392 1.206100000576 0.000300001144
C -2.296900002300 2.188099998257 0.000700000906
H 3.516299998930 -1.578699998441 0.000799999524
H -1.045099998494 -3.197300000918 -0.893700001282
H -2.518600001332 -2.759599998139 0.001100000668
H -1.044699998732 -3.196299998867 0.895700000091
H 4.199199998662 0.780100000095 0.000200002527
H 3.046800001847 1.809199997527 -0.899199999331
H 3.046599999320 1.808299999386 0.900399998617
H -1.808699999148 3.165100001559 -0.000300001144
H -2.932199998609 2.102699998809 0.888099999323
H -2.934600002473 2.102100001812 -0.884900001227
"""
# Write to a file
mol.write(filename=Path("caffeine.xyz+"))
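The exported text above suggests a simple layout: an atom count, then a line with two integers, then one `symbol x y z` row per atom. The parser below is a minimal sketch under the assumption that the second line holds the charge and multiplicity (inferred from the `0 1` in the caffeine output, not from a format specification) and that coordinates are in Angstrom. The function name `parse_xyz_plus` and the two-atom sample are made up for illustration.

```python
def parse_xyz_plus(text):
    """Parse XYZ+-style text into a plain dictionary.

    Assumed layout: line 1 = atom count, line 2 = "charge multiplicity",
    remaining lines = "symbol x y z" with positions in Angstrom.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    n_atoms = int(lines[0])
    charge, multiplicity = (int(value) for value in lines[1].split())

    symbols, coordinates = [], []
    for line in lines[2:2 + n_atoms]:
        symbol, x, y, z = line.split()
        symbols.append(symbol)
        coordinates.append((float(x), float(y), float(z)))

    if len(symbols) != n_atoms:
        raise ValueError("atom count does not match the number of atom rows")

    return {
        "charge": charge,
        "multiplicity": multiplicity,
        "symbols": symbols,
        "coordinates": coordinates,
    }


# Made-up two-atom sample in the assumed layout.
sample = """
2
0 1
He 0.0 0.0 0.0
He 0.0 0.0 2.645886
"""
parsed = parse_xyz_plus(sample)
print(parsed["symbols"], parsed["charge"], parsed["multiplicity"])
```

A parser like this is only useful for inspecting exported text; to rebuild a Molecule, pass the text to the data field as shown earlier.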
Fields
atomic_numbers
The (n, ) array of atomic numbers of the atoms.
- Type: Array[int]
- Additional Details: shape: (-1,)
charge
The overall charge of the atoms.
- Type: int
- Default: 0
geometry
The (n, 3) array of coordinates of the atoms in units of Bohr.
- Type: Array[float]
- Additional Details: shape: (-1, 3)
masses
The (n, ) array of masses of the atoms.
- Type: Array[float]
- Additional Details: shape: (-1,)
multiplicity
A value of None refers to the lowest multiplicity given the electron number parity.
- Type: int
- Default: None
symbols
The (n, ) array of symbols of the atoms.
- Type: Array[str]
- Additional Details: shape: (-1,)