This is the latest version of the tutorial for ΔvinaXGB; the old version can be found here.
The first and second parts of this tutorial cover installing the dependencies and setting up ΔvinaXGB.
The third and fourth parts cover the data set and examples of applying the ΔvinaXGB scoring function to rescore protein-ligand binding affinities.

1. Setup

2. All Dependencies

3. Data Set

  • Before calculating features, three structure input files are needed:
    pdbid_ligand.mol2/sdf --> ligand structure file
    pdbid_protein.pdb --> protein structure file
    pdbid_protein_all.pdb --> protein structure file with water molecules
    Examples of the structure files can be found in the Test_2al5 directory.
    Note: all three files are needed to run predictions, and all files should include hydrogens. If a protein file with water molecules is not available, just copy the original protein.pdb to protein_all.pdb.
  • If features have already been calculated, only the feature file is needed:
    Input.csv --> input feature file
    An example Input.csv can be found in the Test directory.
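The file checks described in the list above can be scripted. The following is a minimal Python sketch, not part of ΔvinaXGB itself: the function name `prepare_structure_files` is hypothetical, and it assumes the files are named `<pdbid>_ligand.mol2` (or `.sdf`), `<pdbid>_protein.pdb`, and `<pdbid>_protein_all.pdb` inside one directory. It does not check that hydrogens are present.

```python
import shutil
from pathlib import Path


def prepare_structure_files(datadir, pdbid):
    """Check the structure inputs for one complex (hypothetical helper).

    Assumes the <pdbid>_ligand / <pdbid>_protein / <pdbid>_protein_all
    naming pattern described in the list above.
    """
    datadir = Path(datadir)
    protein = datadir / f"{pdbid}_protein.pdb"
    protein_all = datadir / f"{pdbid}_protein_all.pdb"

    # The ligand may be provided as either a mol2 or an sdf file.
    candidates = [datadir / f"{pdbid}_ligand.{ext}" for ext in ("mol2", "sdf")]
    ligand = next((path for path in candidates if path.is_file()), None)

    if ligand is None:
        raise FileNotFoundError(f"no {pdbid}_ligand.mol2 or .sdf in {datadir}")
    if not protein.is_file():
        raise FileNotFoundError(f"missing {protein}")

    # No structure with water molecules available: copy the plain protein file.
    if not protein_all.is_file():
        shutil.copyfile(protein, protein_all)

    return ligand, protein, protein_all
```

Calling `prepare_structure_files("Test_2al5", "2al5")` would return the three paths, creating the `_protein_all.pdb` copy only when it is missing.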

4. Run Model

After all of the above has been set up, the example can be run in deltaVinaXGB/DXGB.

You can check the help by

The script can be run for one complex by

--runfeatures turns on feature calculation (by default, all features are calculated); --datadir is the directory containing the structure files; --pdbid is the PDB ID of the structure (another type of index can be used instead); --average calculates the average score from 10 models.

Alternatively, it can be run on a list of protein-ligand complexes by providing their input features, as in Input.csv

By default, scores are predicted for the structures as provided. If you want to get scores with explicit water molecules and optimized ligands:

--water turns on the treatment of water effects; rbw considers both receptor-bound water and bridging water molecules. --opt turns on ligand optimization; rbwo optimizes the ligand in the no-water, bridging-water, and receptor-bound-water environments.
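To show how the options described above combine, here is a small Python sketch that assembles an argument list for a single-complex run. It is only an illustration: `build_dxgb_args` is a hypothetical helper, `script` is a placeholder for the scoring script in deltaVinaXGB/DXGB (its name is not assumed here), and passing `rbw` and `rbwo` as the values of `--water` and `--opt` is an assumption based on the description above. Check the script's help output for the exact syntax.

```python
def build_dxgb_args(script, datadir, pdbid, average=True, water=None, opt=None):
    """Assemble a command line for one complex (hypothetical helper).

    `script` is a placeholder path to the scoring script; only flags
    named in this tutorial are used.
    """
    args = [
        "python", str(script),
        "--runfeatures",              # calculate features before scoring
        "--datadir", str(datadir),    # directory with the structure files
        "--pdbid", str(pdbid),        # structure index
    ]
    if water is not None:
        # Assumption: the water mode (e.g. "rbw") is passed as the flag's value.
        args += ["--water", water]
    if opt is not None:
        # Assumption: the optimization mode (e.g. "rbwo") is passed as the flag's value.
        args += ["--opt", opt]
    if average:
        args.append("--average")      # average the scores from 10 models
    return args
```

The returned list can be handed to `subprocess.run(args, check=True)` from the directory that contains the script.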

The calculated features are saved in the Input.csv file, and the predicted scores are saved in the score.csv file. To get deltaVinaRF scores as well, add --runrf.
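Once a run has finished, the output files can be inspected from Python. The sketch below uses only the standard `csv` module and makes no assumption about the column names in score.csv or Input.csv; `read_csv_rows` is a hypothetical helper name.

```python
import csv


def read_csv_rows(path):
    """Return the rows of a CSV file as dicts keyed by the header row."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    import os

    # Print every row of score.csv if it exists in the current directory.
    if os.path.exists("score.csv"):
        for row in read_csv_rows("score.csv"):
            print(row)
```

All values are returned as strings, so convert score columns with `float()` before doing arithmetic on them.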