|
In drug development, the efficacy of an antibody depends on how it interacts with its target antigen. The strength of this interaction, measured as “binding affinity”, indicates how effectively an antibody neutralizes an antigen. Because traditional techniques for quantifying binding affinity are computationally expensive, deep learning has recently been employed for the task. Despite commendable improvements, deep learning-based binding affinity predictors depend heavily on the quality of the antibody-antigen structures and tend to overlook the evolutionary details of proteins upon mutation. Further, most existing datasets for the task include antibody-antigen pairs related to only one antigen variant and are therefore unsuitable for developing comprehensive data-driven approaches. To address these limitations, we first curate the largest and most generalized (i.e., covering a wide array of antigen variants) datasets for antibody-antigen binding affinity prediction, consisting of more than $100K$ sequence pairs, $8K$ structure pairs, and the corresponding continuous binding affinity values. Subsequently, we propose a novel deep geometric neural network comprising a structure-based model, which accounts for atomistic-scale structural features, and a sequence-based model, which captures sequential and evolutionary information; the two models share their learned information through cross-attention blocks. Further, within each parallel model, we mimic the interaction space of antibodies and antigens through a set of multi-scale hierarchical attention blocks, and the final latent vector of each model is obtained from the antibody and antigen representative vectors and the interaction vector.
The proposed framework exhibited a $10\%$ improvement in mean absolute error over state-of-the-art models while showing a strong correlation ($>0.87$) between predictions and target values. Additionally, we extensively discuss the model optimization strategies, weight space analysis, and interpretability in a post-hoc fashion. We release our datasets and code publicly to support the development of antibody-antigen binding affinity prediction frameworks for the benefit of science and society.
|
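To illustrate how a sequence-based branch and a structure-based branch might share information through cross-attention, the following is a minimal single-head sketch in NumPy. It is not the authors' implementation: the function names (`cross_attention`, `exchange`), the token counts, the embedding width, and the random projection matrices are all hypothetical, chosen only to show the mechanism in which each branch queries the other and adds the attended result back as a residual.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head cross-attention: `queries` attend over `context`.

    queries: (n_q, d), context: (n_c, d); w_q, w_k, w_v: (d, d).
    Returns the residually updated queries (n_q, d) and the
    attention weights (n_q, n_c).
    """
    q = queries @ w_q
    k = context @ w_k
    v = context @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])  # scaled dot-product scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return queries + weights @ v, weights


def exchange(seq_tokens, struct_tokens, params):
    # Bidirectional exchange: each branch queries the other branch's tokens.
    # Both directions read the inputs from before the exchange.
    seq_out, w_seq = cross_attention(seq_tokens, struct_tokens, *params["seq"])
    struct_out, w_struct = cross_attention(struct_tokens, seq_tokens, *params["struct"])
    return seq_out, struct_out, w_seq, w_struct


# Hypothetical sizes: 12 sequence tokens (residues), 30 structure tokens (atoms).
rng = np.random.default_rng(0)
d = 16
seq_tokens = rng.normal(size=(12, d))
struct_tokens = rng.normal(size=(30, d))
params = {
    "seq": [rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3)],
    "struct": [rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3)],
}

seq_out, struct_out, w_seq, w_struct = exchange(seq_tokens, struct_tokens, params)
print(seq_out.shape, struct_out.shape)
```

In this sketch each branch keeps its own token count, so the exchange enriches the per-token features without forcing the two representations onto a common length; a real model would typically use multiple heads, normalization, and learned weights.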