All setup functions are in the Makefile. To see which functions the Makefile provides:
make -f Makefile
To install DTNN_7ib, first create a new environment:
make Makefile create_environment
Then activate the DTNN_7ib environment and install all requirements automatically using the Makefile:
source activate DTNN_7ib
make Makefile requirements
Note
All neural networks are built with TensorFlow; detailed installation instructions can be found in the TensorFlow documentation.
All raw dataset and property files are in the src/data/raw directory.
The data used in our paper are in the src/data/external directory.
All scripts related to data preparation are in the src/data directory. To check the options for data preparation:
python prepare_dataset.py --help
By default, the script prepares the training, validation, testlive, and test sets from the qm9mmff dataset:
python prepare_dataset.py
The inputdir (default: data/raw) holds the initial dataset files in tfrecord/tar format. The output files are written to outputdir (default: data/processed/qm9mmff) and include the train.tfrecord, testlive.tfrecord, validation, and test.tfrecord files.
You can also change inputdir or outputdir as follows:
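After running the preparation script, it can be handy to confirm that the output files were actually written. The following is a minimal sketch, not part of the repository: `missing_outputs` is a hypothetical helper, and it only checks the three file names spelled out above (the exact name of the validation file is not given in the text, so it is left out).

```python
from pathlib import Path

# File names taken from the text above. The validation file's exact name is not
# spelled out there, so it is deliberately omitted from this check.
EXPECTED_FILES = ("train.tfrecord", "testlive.tfrecord", "test.tfrecord")


def missing_outputs(outputdir="data/processed/qm9mmff", expected=EXPECTED_FILES):
    """Return the expected file names that are not present in outputdir."""
    root = Path(outputdir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    if missing:
        print("Missing output files:", ", ".join(missing))
    else:
        print("All expected output files are present.")
```

If you pass a custom `--outputdir` to `prepare_dataset.py`, pass the same path to `missing_outputs`.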
python prepare_dataset.py --inputdir user_dir
python prepare_dataset.py --outputdir user_dir
To build the training, validation, testlive, and test sets from eMol9_CM:
python prepare_dataset.py --datatype emol9mmff
To build the test set from Plati_CM:
python prepare_dataset.py --datatype platinummmff
All scripts for model training are in the src/models directory. To check the options for model development:
python train_model.py --help
Many hyperparameters can be changed by the user.
By default, the script trains the DTNN_7ib model with the same hyperparameters as in our paper:
python train_model.py --addnewm
Train TL_QM9M:
python train_model.py --geometry MMFF --transferlearning
Here, the input geometries are changed from DFT optimized to MMFF optimized, and --transferlearning is the flag for transfer learning.
Train TL_eMol9_CM:
python train_model.py --datatype emol9mmff --geometry MMFF --transferlearning
Here, the eMol9_CM dataset is used to train the model on conformations.
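The three training commands above differ only in their flags, so they can be summarized in a small lookup. This is an illustrative sketch, not part of the repository: `TRAINING_RECIPES` and `training_command` are hypothetical names, the flag sets are copied verbatim from the commands above, and nothing is executed.

```python
import shlex

# Flag sets copied from the commands above; the dictionary keys are only labels.
TRAINING_RECIPES = {
    "DTNN_7ib": ["--addnewm"],
    "TL_QM9M": ["--geometry", "MMFF", "--transferlearning"],
    "TL_eMol9_CM": [
        "--datatype", "emol9mmff",
        "--geometry", "MMFF",
        "--transferlearning",
    ],
}


def training_command(recipe):
    """Return the train_model.py command line for a named recipe as a string."""
    return shlex.join(["python", "train_model.py", *TRAINING_RECIPES[recipe]])


if __name__ == "__main__":
    # Print the commands only; run them yourself from the src/models directory.
    for name in TRAINING_RECIPES:
        print(f"{name}: {training_command(name)}")
```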
All scripts for model evaluation are in the src/models directory. Our pretrained models are in models. Here, we provide the best model trained with five different random seeds.
To check the options for model evaluation:
python predict_model.py --help
Performance of DTNN_7ib on QM9:
python predict_model.py
Performance of DTNN_7ib on QM9M:
python predict_model.py --testpositions mmffpositions
--testpositions selects the input geometries.
For QM9M, positions selects the DFT optimized geometries and mmffpositions selects the MMFF optimized geometries.
For eMol9_CM and Plati_CM, positions1 selects the MMFF optimized geometries and positions2 selects the DFT optimized geometries.
Performance of TL_QM9M on QM9M:
python predict_model.py --modelname TL_QM9_name --testpositions mmffpositions
Here, TL_QM9_name should be replaced with the directory of the trained TL_QM9M model.
Performance of TL_eMol9_CM on eMol9_CM:
python predict_model.py --modelname TL_eMOL9_CM_name --testtype emol9mmff --testpositions positions1
Performance of TL_eMol9_CM on Plati_CM:
python predict_model.py --modelname TL_eMOL9_CM_name --testtype platinummmff --testpositions positions1
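The evaluation commands above all follow the same pattern: each optional flag is added only when the default should be overridden. A minimal sketch of that pattern is shown below. `predict_command` is a hypothetical helper, not part of the repository; it only assembles the argument list from the three flags used above, and `TL_eMOL9_CM_name` remains a placeholder for your own model directory.

```python
def predict_command(modelname=None, testtype=None, testpositions=None):
    """Build the argument list for predict_model.py.

    A value of None means the flag is omitted, so the script default applies.
    """
    cmd = ["python", "predict_model.py"]
    options = (
        ("--modelname", modelname),
        ("--testtype", testtype),
        ("--testpositions", testpositions),
    )
    for flag, value in options:
        if value is not None:
            cmd += [flag, value]
    return cmd


if __name__ == "__main__":
    # DTNN_7ib on QM9M (MMFF optimized geometries).
    print(" ".join(predict_command(testpositions="mmffpositions")))
    # TL_eMol9_CM on Plati_CM; replace the placeholder with your model directory.
    print(" ".join(predict_command(
        modelname="TL_eMOL9_CM_name",
        testtype="platinummmff",
        testpositions="positions1",
    )))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the src/models directory if you want to launch the evaluation from Python.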