SMILES from graph
Asked Answered
G

1

13

Is there a method or package that converts a graph (or adjacency matrix) into a SMILES string?

For instance, I know the atoms are [6 6 7 6 6 6 6 8] ([C C N C C C C O]), and the adjacency matrix is

[[ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],

 [ 1.,  0.,  2.,  0.,  0.,  0.,  0.,  1.],

 [ 0.,  2.,  0.,  1.,  0.,  0.,  0.,  0.],

 [ 0.,  0.,  1.,  0.,  1.,  0.,  0.,  0.],

 [ 0.,  0.,  0.,  1.,  0.,  1.,  0.,  0.],

 [ 0.,  0.,  0.,  0.,  1.,  0.,  1.,  1.],

 [ 0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.],

 [ 0.,  1.,  0.,  0.,  0.,  1.,  0.,  0.]]

I need some function to output 'CC1=NCCC(C)O1'.

It also works if some function can output the corresponding "mol" object. The RDkit software has a 'MolFromSmiles' function. I wonder if there is something like 'MolFromGraphs'.

Guanajuato answered 5/7, 2018 at 15:44 Comment(0)
A
15

Here is a simple solution, to my knowledge there is no built-in function for this in RDKit.

def MolFromGraphs(node_list, adjacency_matrix):

    # create empty editable mol object
    mol = Chem.RWMol()

    # add atoms to mol and keep track of index
    node_to_idx = {}
    for i in range(len(node_list)):
        a = Chem.Atom(node_list[i])
        molIdx = mol.AddAtom(a)
        node_to_idx[i] = molIdx

    # add bonds between adjacent atoms
    for ix, row in enumerate(adjacency_matrix):
        for iy, bond in enumerate(row):

            # only traverse half the matrix
            if iy <= ix:
                continue

            # add relevant bond type (there are many more of these)
            if bond == 0:
                continue
            elif bond == 1:
                bond_type = Chem.rdchem.BondType.SINGLE
                mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)
            elif bond == 2:
                bond_type = Chem.rdchem.BondType.DOUBLE
                mol.AddBond(node_to_idx[ix], node_to_idx[iy], bond_type)

    # Convert RWMol to Mol object
    mol = mol.GetMol()            

    return mol
Chem.MolToSmiles(MolFromGraphs(nodes, a))

Out:
'CC1=NCCC(C)O1'

This solution is a simplified version of https://github.com/dakoner/keras-molecules/blob/dbbb790e74e406faa70b13e8be8104d9e938eba2/convert_rdkit_to_networkx.py

There are many other atom properties (such as Chirality or Protonation state) and bond types (Triple, Dative...) that may need to be set. It is better to keep track of these explicitly in your graph if possible (as in the link above), but this function can also be extended to incorporate these if required.

Ahern answered 9/7, 2018 at 9:29 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.