Molecule
A Molecule defines a collection of atoms by their geometry (in Bohr), atomic numbers, charge, and multiplicity. It can be created from a variety of sources and data formats.
The Molecule building block is a critical input (and often output) of most workflows.
Creating a new Molecule
A common way to create a Molecule is to set its fields directly:
import sierra
from sierra.inputs import *
# Build a molecule from raw data, note the distances are in Bohr
he2 = Molecule(atomic_numbers=[2, 2], geometry=[0, 0, 0, 0, 0, 5])
print(he2)
#> Molecule(formula='He2', eoi='c8de2a8')
print(he2.measure([0, 1]))
#> 5.0
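Here, charge takes its default of 0. The multiplicity is left at its default of None, which selects the lowest multiplicity allowed by the electron number parity; for He2, with four electrons, that is 1.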
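The Fields section describes a multiplicity of None as the lowest multiplicity given the electron number parity. A minimal sketch of that rule follows; the function name `lowest_multiplicity` is hypothetical and this is an illustration of the parity rule, not Sierra's own code.

```python
def lowest_multiplicity(atomic_numbers, charge=0):
    """Return the lowest spin multiplicity allowed by electron-count parity.

    Hypothetical helper for illustration only; not part of Sierra.
    """
    n_electrons = sum(atomic_numbers) - charge
    if n_electrons < 0:
        raise ValueError("charge exceeds the total nuclear charge")
    # Even electron count -> singlet (1); odd electron count -> doublet (2).
    return 1 if n_electrons % 2 == 0 else 2


print(lowest_multiplicity([2, 2]))  # He2, 4 electrons -> 1
print(lowest_multiplicity([8, 1]))  # OH radical, 9 electrons -> 2
```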
Element symbols can also be used in place of atomic numbers for initialization:
from sierra.inputs import *
# Build a molecule from symbols, note the distances are in Bohr
he2 = Molecule(symbols=["He", "He"], geometry=[0, 0, 0, 0, 0, 5])
print(he2)
#> Molecule(formula='He2', eoi='c8de2a8')
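Both examples pass geometry as a flat list of six numbers, three Cartesian components per atom, in Bohr. The sketch below shows how a distance such as the 5.0 reported by he2.measure([0, 1]) follows from that layout. The `distance` helper is hypothetical and only mirrors the arithmetic; it makes no claim about how measure is implemented.

```python
import math


def distance(geometry, i, j):
    """Distance between atoms i and j for a flat [x0, y0, z0, x1, ...] list.

    Hypothetical helper for illustration only; units follow the input (Bohr here).
    """
    if len(geometry) % 3 != 0:
        raise ValueError("geometry length must be a multiple of 3")
    # Slice out the three Cartesian components of each atom.
    a = geometry[3 * i : 3 * i + 3]
    b = geometry[3 * j : 3 * j + 3]
    return math.dist(a, b)


print(distance([0, 0, 0, 0, 0, 5], 0, 1))  # 5.0
```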
Importing common file formats
It is also common to construct a Molecule from SDF, XYZ or XYZ+ text. These formats specify positions in Angstrom, and Molecule converts them to Bohr before storing them in the geometry field.
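For orientation, the Angstrom-to-Bohr conversion can be sketched as below. The constant is the CODATA 2018 Bohr radius; which exact value Sierra uses internally is an assumption, and the helper names are hypothetical.

```python
# Bohr radius in Angstrom (CODATA 2018). Assumption: Sierra's internal
# constant may differ in the last digits.
BOHR_IN_ANGSTROM = 0.529177210903


def angstrom_to_bohr(values):
    """Convert a sequence of lengths from Angstrom to Bohr."""
    return [v / BOHR_IN_ANGSTROM for v in values]


def bohr_to_angstrom(values):
    """Convert a sequence of lengths from Bohr to Angstrom."""
    return [v * BOHR_IN_ANGSTROM for v in values]


# 1 Angstrom is roughly 1.89 Bohr.
print(angstrom_to_bohr([0.0, 0.0, 1.0]))
```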
A compatible file can be loaded with the file field:
from sierra.inputs import *
# Build a molecule from an SDF, XYZ or XYZ+ file
water = Molecule(file="examples/atoms.xyz")
Alternatively, the file content can be passed to the data field:
from sierra.inputs import *
# Build a molecule from SDF, XYZ or XYZ+ contents
# Note the distances are in Angstrom
water = Molecule(
data="""
O 0 0 0
H 0 0 1
H 0 1 0
"""
)
print(water)
#> Molecule(formula='H2O', eoi='b7a2e71')
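To make the unit handling concrete, here is a rough sketch of turning "symbol x y z" lines in Angstrom into a symbol list and a flat Bohr geometry. The parser `parse_xyz_body` is hypothetical and handles only the bare atom lines used in the example; Sierra's actual SDF/XYZ/XYZ+ reader is not shown in this document and will accept more than this.

```python
BOHR_IN_ANGSTROM = 0.529177210903  # CODATA 2018 Bohr radius in Angstrom


def parse_xyz_body(text):
    """Parse 'symbol x y z' lines (Angstrom) into symbols and a flat Bohr geometry.

    Hypothetical helper for illustration only; not Sierra's parser.
    """
    symbols, geometry = [], []
    for line in text.strip().splitlines():
        parts = line.split()
        if len(parts) != 4:
            raise ValueError(f"expected 'symbol x y z', got: {line!r}")
        symbols.append(parts[0])
        # Convert each coordinate from Angstrom to Bohr.
        geometry.extend(float(p) / BOHR_IN_ANGSTROM for p in parts[1:])
    return symbols, geometry


symbols, geometry = parse_xyz_body("""
O 0 0 0
H 0 0 1
H 0 1 0
""")
print(symbols)        # ['O', 'H', 'H']
print(len(geometry))  # 9
```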
Generating from a SMILES string
A Molecule can be generated directly from a SMILES string through the smiles field:
from sierra.inputs import *
butane = Molecule(smiles="CCCC")
print(butane)
#> Molecule(formula='C4H10', eoi='5c33557')
Here, our internal conformer tools generate a structure from the SMILES string. This implementation prioritizes fast generation of a reasonable structure over a rigorous conformational search. Use the Conformer workflow for full control over geometry generation.
Importing from PubChem
A Molecule can also be created through the PubChem interface. The pubchem field searches PubChem for the best common-name match and generates a Molecule from the result.
from sierra.inputs import *
caffeine = Molecule(pubchem="caffeine")
print(caffeine)
#> Molecule(formula='C8H10N4O2', eoi='328969a')
Warning
The pubchem interface sends data to PubChem servers and should not be used for proprietary material. This is the only operation in Sierra that contacts an outside server; all other calls, including the Conformer workflow, run locally.
Exporting a Molecule
Molecule objects can be exported to a string or to a file in XYZ+ format:
from pathlib import Path
from sierra.inputs import *
mol = Molecule(pubchem="caffeine")
# Write a molecule as XYZ+ format
xyz_text = mol.write()
print(xyz_text)
"""
24
0 1
O 0.470000000755 2.568800000562 0.000600002289
O -3.127099997715 -0.443600000591 -0.000300001144
N -0.968599997718 -1.312500000756 0.000000000000
N 2.218200000188 0.141200000637 -0.000300001144
N -1.347700000975 1.079700001715 -0.000099998618
N 1.411900002456 -1.937200000728 0.000200002527
C 0.857899998774 0.259199999205 -0.000799999524
C 0.389699999594 -1.026399997716 -0.000399999762
C 0.030699998928 1.422000000414 -0.000600002289
C -1.906100002038 -0.249500001009 -0.000399999762
C 2.503200002560 -1.199799997711 0.000300001144
C -1.427600002373 -2.695999998946 0.000799999524
C 3.192600002392 1.206100000576 0.000300001144
C -2.296900002300 2.188099998257 0.000700000906
H 3.516299998930 -1.578699998441 0.000799999524
H -1.045099998494 -3.197300000918 -0.893700001282
H -2.518600001332 -2.759599998139 0.001100000668
H -1.044699998732 -3.196299998867 0.895700000091
H 4.199199998662 0.780100000095 0.000200002527
H 3.046800001847 1.809199997527 -0.899199999331
H 3.046599999320 1.808299999386 0.900399998617
H -1.808699999148 3.165100001559 -0.000300001144
H -2.932199998609 2.102699998809 0.888099999323
H -2.934600002473 2.102100001812 -0.884900001227
"""
# Write to a file
mol.write(filename=Path("caffeine.xyz+"))
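Judging from the sample output above, the XYZ+ text consists of an atom-count line, a "charge multiplicity" line, and one "symbol x y z" line per atom in Angstrom. The sketch below reproduces that layout under those assumptions. The function `to_xyz_plus`, the single-space separators, and the 12-decimal formatting are inferred from the sample, not taken from a format specification, and the multiplicity is passed explicitly rather than resolved from None.

```python
BOHR_IN_ANGSTROM = 0.529177210903  # CODATA 2018 Bohr radius in Angstrom


def to_xyz_plus(symbols, geometry_bohr, charge=0, multiplicity=1):
    """Format atoms as XYZ+-style text (layout inferred from the sample output).

    Hypothetical helper for illustration only; not Sierra's writer.
    geometry_bohr is a flat [x0, y0, z0, x1, ...] list in Bohr.
    """
    n_atoms = len(symbols)
    if len(geometry_bohr) != 3 * n_atoms:
        raise ValueError("geometry must hold three coordinates per atom")
    lines = [str(n_atoms), f"{charge} {multiplicity}"]
    for k, symbol in enumerate(symbols):
        # Convert the stored Bohr coordinates back to Angstrom for output.
        x, y, z = (c * BOHR_IN_ANGSTROM for c in geometry_bohr[3 * k : 3 * k + 3])
        lines.append(f"{symbol} {x:.12f} {y:.12f} {z:.12f}")
    return "\n".join(lines) + "\n"


print(to_xyz_plus(["He", "He"], [0, 0, 0, 0, 0, 5]))
```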
Fields
atomic_numbers
The (n, ) array of atomic numbers of the atoms.
- Type: Array[int]
- Additional Details: shape: (-1,)
charge
The overall charge of the atoms.
- Type: int
- Default: 0
geometry
The (n, 3) array of coordinates of the atoms in units of Bohr.
- Type: Array[float]
- Additional Details: shape: (-1, 3)
masses
The (n, ) array of masses of the atoms.
- Type: Array[float]
- Additional Details: shape: (-1,)
multiplicity
A value of None refers to the lowest multiplicity given the electron number parity.
- Type: int
- Default: None
symbols
The (n, ) array of symbols of the atoms.
- Type: Array[str]
- Additional Details: shape: (-1,)
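The shapes listed above imply that every per-atom array has n entries and that geometry has n rows of three coordinates. As a rough illustration, the sketch below regroups a flat coordinate list into rows and checks it against the atom count. The helper names `as_rows` and `check_consistent` are hypothetical, and whether Sierra performs this validation in the same way is an assumption.

```python
def as_rows(flat, width=3):
    """Regroup a flat list into rows of `width` items, like a (-1, 3) reshape."""
    if len(flat) % width != 0:
        raise ValueError(f"length {len(flat)} is not a multiple of {width}")
    return [list(flat[k : k + width]) for k in range(0, len(flat), width)]


def check_consistent(atomic_numbers, geometry_flat):
    """Return geometry rows after checking there is one row per atom.

    Hypothetical helper for illustration only; not part of Sierra.
    """
    rows = as_rows(geometry_flat)
    if len(rows) != len(atomic_numbers):
        raise ValueError("geometry and atomic_numbers describe different atom counts")
    return rows


print(check_consistent([2, 2], [0, 0, 0, 0, 0, 5]))  # [[0, 0, 0], [0, 0, 5]]
```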