Home

1. Installation

All setup targets are defined in the Makefile. To list the available targets:

make -f Makefile

To install DTNN_7ib, first create a new environment:

make -f Makefile create_environment

Then activate the DTNN_7ib environment and install all requirements automatically with the Makefile:

source activate DTNN_7ib
make -f Makefile requirements

Note

All neural networks are built with TensorFlow. Detailed installation instructions are available at:

https://www.tensorflow.org/install
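Before moving on, it can help to confirm that TensorFlow is visible to the interpreter in the active environment. The following is a minimal sketch that uses only the Python standard library; the helper name is_installed is hypothetical and not part of this repository.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the import system can locate a top-level module."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "tensorflow" is the import name of the TensorFlow package.
    status = "found" if is_installed("tensorflow") else "missing"
    print(f"tensorflow: {status}")
```

If the check reports that the module is missing, make sure the DTNN_7ib environment is activated before running it.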

2. Data Preparation

All raw dataset and property files are in the src/data/raw directory.

The data used in our paper are in the src/data/external directory.

All scripts related to data preparation are in the src/data directory. To check the options for data preparation:

python prepare_dataset.py --help

The default option prepares the training, validation, testlive, and test sets from the qm9mmff dataset:

python prepare_dataset.py

The inputdir (default: data/raw) holds the initial dataset files in tfrecord/tar format. The output files are written to outputdir (default: data/processed/qm9mmff) and include the train.tfrecord, testlive.tfrecord, validation, and test.tfrecord files.
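After a preparation run, a quick check that the expected files exist can catch a wrong outputdir early. The sketch below is illustrative only: the helper missing_outputs is hypothetical, and it checks only the three file names spelled out in full above, since the exact name of the validation file is not given there.

```python
from pathlib import Path

# File names stated in the text; the validation file name is not spelled
# out in full there, so it is deliberately left out of this check.
EXPECTED_FILES = ("train.tfrecord", "testlive.tfrecord", "test.tfrecord")


def missing_outputs(outputdir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from outputdir."""
    root = Path(outputdir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Default outputdir as stated in the text.
    print(missing_outputs("data/processed/qm9mmff"))
```

An empty list means all three files were found.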

You can also change inputdir or outputdir as follows:

python prepare_dataset.py --inputdir user_dir
python prepare_dataset.py --outputdir user_dir

To build the training, validation, testlive, and test sets from eMol9_C_M:

python prepare_dataset.py --datatype emol9mmff

To build the test set from Plati_C_M:

python prepare_dataset.py --datatype platinummmff
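When preparing several datasets in one go, the commands above can be assembled programmatically. This is a rough sketch, not part of the repository: prepare_command is a hypothetical helper, and it uses only the --datatype, --inputdir, and --outputdir flags shown in this section.

```python
import sys


def prepare_command(datatype=None, inputdir=None, outputdir=None):
    """Build the argument list for prepare_dataset.py without running it.

    Any option left as None is omitted, so the script's default applies.
    """
    cmd = [sys.executable, "prepare_dataset.py"]
    options = (
        ("--datatype", datatype),
        ("--inputdir", inputdir),
        ("--outputdir", outputdir),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, value]
    return cmd


if __name__ == "__main__":
    # The two non-default datasets named in this section.
    for datatype in ("emol9mmff", "platinummmff"):
        print(prepare_command(datatype=datatype)[1:])
```

Each list can be passed to subprocess.run(cmd, check=True, cwd="src/data") to execute it from the directory that holds the script.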

3. Model Development

All scripts for model training are in the src/models directory. To check the options for model development:

python train_model.py --help

Many hyperparameters can be changed by the user.

The default option trains the DTNN_7ib model with the same hyperparameters as in our paper:

python train_model.py --addnewm

To train TL_QM9_M:

python train_model.py --geometry MMFF --transferlearning

Here, the input geometries are changed from DFT-optimized to MMFF-optimized, and the --transferlearning flag enables transfer learning.

To train TL_eMol9_C_M:

python train_model.py --datatype emol9mmff --geometry MMFF --transferlearning

Here, the eMol9_C_M dataset is used to train the model on conformations.
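The three training setups differ only in their flags, which can be summarized in a small lookup table. The sketch below copies the flags from the commands in this section; the table and the training_command helper are hypothetical conveniences, not part of the repository.

```python
# Flags for each model, copied from the commands in this section.
TRAINING_FLAGS = {
    "DTNN_7ib": ["--addnewm"],
    "TL_QM9_M": ["--geometry", "MMFF", "--transferlearning"],
    "TL_eMol9_C_M": [
        "--datatype", "emol9mmff",
        "--geometry", "MMFF",
        "--transferlearning",
    ],
}


def training_command(model):
    """Return the argument list that trains model; KeyError if unknown."""
    return ["python", "train_model.py", *TRAINING_FLAGS[model]]


if __name__ == "__main__":
    for name in TRAINING_FLAGS:
        print(name, "->", " ".join(training_command(name)))
```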

4. Model Evaluation

All scripts for model evaluation are in the src/models directory. Our pretrained models are in the models directory. We provide the best model trained with five different random seeds.
To check the options for model evaluation:

python predict_model.py --help

Performance of DTNN_7ib on QM9:

python predict_model.py

Performance of DTNN_7ib on QM9_M:

python predict_model.py --testpositions mmffpositions

--testpositions selects the input geometries.
For QM9_M, positions selects the DFT-optimized geometries and mmffpositions selects the MMFF-optimized geometries.
For eMol9_C_M and Plati_C_M, positions1 selects the MMFF-optimized geometries and positions2 selects the DFT-optimized geometries.
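Because the meaning of the position keys depends on the dataset, it is easy to pick the wrong one. The mapping described above can be written down as a lookup table; this is an illustrative sketch, and the names POSITION_KEYS and positions_key are hypothetical.

```python
# (dataset, geometry) -> value for --testpositions, as described in the text.
POSITION_KEYS = {
    ("QM9_M", "DFT"): "positions",
    ("QM9_M", "MMFF"): "mmffpositions",
    ("eMol9_C_M", "MMFF"): "positions1",
    ("eMol9_C_M", "DFT"): "positions2",
    ("Plati_C_M", "MMFF"): "positions1",
    ("Plati_C_M", "DFT"): "positions2",
}


def positions_key(dataset, geometry):
    """Return the --testpositions value for a dataset and geometry type."""
    try:
        return POSITION_KEYS[(dataset, geometry)]
    except KeyError:
        raise ValueError(
            f"no position key for {dataset!r} with {geometry!r}"
        ) from None


if __name__ == "__main__":
    print(positions_key("QM9_M", "MMFF"))
    print(positions_key("Plati_C_M", "DFT"))
```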

Performance of TL_QM9_M on QM9_M:

python predict_model.py --modelname TL_QM9_name --testpositions mmffpositions

Here, replace TL_QM9_name with the directory of the trained TL_QM9_M model.

Performance of TL_eMol9_C_M on eMol9_C_M:

python predict_model.py --modelname TL_eMOL9_CM_name --testtype emol9mmff --testpositions positions1

Performance of TL_eMol9_C_M on Plati_C_M:

python predict_model.py --modelname TL_eMOL9_CM_name --testtype platinummmff --testpositions positions1
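To rerun every evaluation in this section in sequence, the commands can be generated from a single table. The sketch below is an assumption-laden convenience, not part of the repository: predict_command is a hypothetical helper, and TL_QM9_name and TL_eMOL9_CM_name remain placeholders for the directories of your trained models.

```python
import shlex

# One row per evaluation above: (--modelname, --testtype, --testpositions).
# None means the flag is omitted and the script default is used.
EVALUATIONS = [
    (None, None, None),                                   # DTNN_7ib on QM9
    (None, None, "mmffpositions"),                        # DTNN_7ib on QM9_M
    ("TL_QM9_name", None, "mmffpositions"),               # TL_QM9_M on QM9_M
    ("TL_eMOL9_CM_name", "emol9mmff", "positions1"),      # on eMol9_C_M
    ("TL_eMOL9_CM_name", "platinummmff", "positions1"),   # on Plati_C_M
]


def predict_command(modelname=None, testtype=None, testpositions=None):
    """Build the argument list for predict_model.py without running it."""
    cmd = ["python", "predict_model.py"]
    options = (
        ("--modelname", modelname),
        ("--testtype", testtype),
        ("--testpositions", testpositions),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, value]
    return cmd


if __name__ == "__main__":
    for row in EVALUATIONS:
        print(shlex.join(predict_command(*row)))
```

Printing the commands first, rather than executing them, makes it easy to review the placeholders before any model is loaded.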
