Owing to the complexity of the immune system, studies of the effect of individual alleles on single proteins are increasingly being extended to consider the impact of alleles on gene-gene and gene-strain interactions (Deffur et al., 2013; Möller et al., 2010). Many studies have explored these interactions using a reductionist approach, that is, by considering the effects of small sets of alleles on small sets of genes. Other studies have used high-throughput technologies to extend these approaches. For example, a genome-wide screening approach was used to identify human sequence variants associated with gene expression levels (expression quantitative trait loci, eQTLs) in Mtb-infected compared to Mtb-uninfected dendritic cells (Barreiro et al., 2012). Systems biology approaches to understanding the relationship between HIV, Mtb, and their human host may enable the prediction of emerging patterns that could complement reductionist approaches (Deffur et al., 2013). Interactions between human and pathogen proteins are essential for the survival and viability of the pathogen within the host. As such, human-pathogen protein-protein interaction networks of human-Mtb proteins (Huo et al., 2015; Rapanoel et al., 2013), as well as human-HIV-1 proteins (Fu et al., 2009), have been constructed, which may further understanding of the molecular mechanisms of survival and pathogenicity.
1.4.1 Protein interactions and interaction networks
The function of a protein cannot be fully understood by studying the protein in isolation, and interactions between proteins can be used to describe and narrow down a protein’s function.
Some of the functions of protein-protein interactions (PPIs) are to modify the kinetic properties of enzymes, enable substrate channeling, create new binding sites for small effector molecules, inactivate or suppress a protein, change the binding specificity of a protein, and regulate downstream or upstream processes (Rao et al., 2014). PPIs can be detected using a variety of methods, including in vitro methods, such as affinity chromatography and protein arrays, as well as in vivo methods, such as yeast two-hybrid systems. Moreover, PPIs can be predicted using in silico methods, such as sequence-based, gene expression-based, and structure-based approaches (Rao et al., 2014). Protein interactions can be described as physical or functional: physical interactions involve physical contact between proteins (e.g. binding), whereas functional interactions are indirect associations between proteins (e.g. involvement in the same biological process). Because interactions can be discovered or predicted using a wide variety of approaches, it is necessary to be aware that each approach is subject to its own biases and shortcomings and may incorrectly classify interactions. Integrating data from multiple sources (triangulation) improves confidence in the interactions: interactions from different sources can be weighted based on their reliability, and a confidence score can be assigned based on the number of sources that report that interaction (Szklarczyk et al., 2014). Furthermore, to reduce the likelihood of false positives and false negatives, a reliability threshold can be applied to this score to ensure that only high-confidence interactions are included (Mazandu and Mulder, 2011). A protein-protein interaction network (PPIN) contains all known and predicted protein interactions in an organism or between organisms. By incorporating all physical and functional interactions with appropriate thresholds into a functional PPIN, it is possible to illustrate a more complete picture of the biological context in which the protein functions (Mazandu and Mulder, 2011).
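The integration and thresholding described above can be sketched as follows. This is a minimal illustration, assuming that each source reports a reliability between 0 and 1 for an interaction and that sources are independent, so that scores can be combined with a noisy-OR rule. The protein identifiers, source names, scores, and the 0.7 cutoff are hypothetical, and the combination rule is not necessarily the one used in the cited studies.

```python
from math import prod


def combined_confidence(source_scores):
    """Combine per-source reliabilities with a noisy-OR rule.

    Assumes sources are independent: the combined score is the
    probability that at least one source is correct.
    """
    return 1.0 - prod(1.0 - score for score in source_scores.values())


def filter_interactions(evidence, threshold):
    """Keep only interactions whose combined score reaches the threshold."""
    kept = {}
    for pair, source_scores in evidence.items():
        score = combined_confidence(source_scores)
        if score >= threshold:
            kept[pair] = score
    return kept


# Hypothetical evidence: protein pair -> {source name: reliability}
evidence = {
    ("P1", "P2"): {"yeast_two_hybrid": 0.6, "coexpression": 0.5},
    ("P1", "P3"): {"coexpression": 0.3},
    ("P2", "P4"): {
        "affinity_chromatography": 0.8,
        "sequence_similarity": 0.4,
        "coexpression": 0.2,
    },
}

high_confidence = filter_interactions(evidence, threshold=0.7)
```

Under this rule an interaction supported by several moderately reliable sources can outrank one supported by a single source, which is the intended effect of triangulation; the single-source pair ("P1", "P3") falls below the hypothetical cutoff and is excluded.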
Within a PPIN, proteins are depicted as nodes and the interactions between them are depicted as edges. PPINs are constructed as network graphs and, as such, concepts and algorithms from graph theory can be used to identify subnetworks of important proteins and interactions within the complete network (Mulder et al., 2014). Network representation of PPIs assists visualisation and identification of potentially biologically important proteins at a high-throughput scale. Various databases exist containing intraspecies PPIs, such as STRING (Szklarczyk et al., 2014) and Reactome (Croft et al., 2014); for a comprehensive list, refer to the review by Rao et al. (2014). PPINs tend to be modular in structure and exhibit a scale-free property in which many proteins are involved in few interactions, and few proteins are involved in many interactions. As such, the relative importance of a protein within the network can be determined by using centrality measures, such as how many proteins it interacts with (degree), how often it falls on the shortest path between two other proteins (betweenness), and the average shortest-path distance between it and every other protein (closeness).
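The three centrality measures described above can be sketched in plain Python for a small unweighted, undirected network. This is an illustrative sketch only: the five-protein network is hypothetical, betweenness is computed with Brandes' algorithm and left unnormalised, and closeness is taken as the inverse of the average shortest-path distance to reachable proteins.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Number of interaction partners of each protein."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def closeness_centrality(graph):
    """Inverse of the mean shortest-path distance to reachable proteins."""
    result = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(graph):
    """Unnormalised betweenness via Brandes' algorithm (unweighted graph)."""
    score = {node: 0.0 for node in graph}
    for source in graph:
        order = []                               # nodes in BFS visiting order
        preds = {node: [] for node in graph}     # shortest-path predecessors
        n_paths = {node: 0 for node in graph}    # shortest-path counts
        n_paths[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            order.append(node)
            for neighbour in graph[node]:
                if neighbour not in dist:
                    dist[neighbour] = dist[node] + 1
                    queue.append(neighbour)
                if dist[neighbour] == dist[node] + 1:
                    n_paths[neighbour] += n_paths[node]
                    preds[neighbour].append(node)
        dependency = {node: 0.0 for node in graph}
        for node in reversed(order):
            for pred in preds[node]:
                share = n_paths[pred] / n_paths[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                score[node] += dependency[node]
    # Each unordered pair of endpoints was counted from both ends.
    return {node: value / 2 for node, value in score.items()}


# Hypothetical network: H is a hub, D hangs off C.
ppin = build_graph([("H", "A"), ("H", "B"), ("H", "C"), ("C", "D")])
degree = degree_centrality(ppin)
closeness = closeness_centrality(ppin)
betweenness = betweenness_centrality(ppin)
```

In this toy network the hub H ranks highest on all three measures, while C has a lower degree than H but a non-zero betweenness because it is the only route to D.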
1.4.2 Host-pathogen interactions
Host-pathogen PPINs provide an opportunity to study the complexities of the relationship between the host and the pathogen during infection (Mulder et al., 2014). Experimental detection of host-pathogen PPIs is limited; however, computational methods for predicting host-pathogen interactions are increasing (Mulder et al., 2014).
Host-pathogen interactions can be predicted using interologues, which are conserved interactions: two proteins are predicted to interact if their orthologues are known to interact in another organism (Rapanoel et al., 2013). After prediction, these interactions need to be filtered for biological relevance, as many of them are unlikely to occur in vivo (Mulder et al., 2014; Rapanoel et al., 2013). For example, Rapanoel et al. (2013) used interologues to predict interactions between Mtb and human proteins, which were then filtered based on differential gene expression in microarray data.
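The interologue approach followed by an expression-based filter can be sketched as below. This is a simplified illustration rather than the pipeline of the cited study: the orthologue maps, protein identifiers, and the choice to filter on differentially expressed host genes are all assumptions made for the example.

```python
def predict_interologues(template_ppis, host_orthologues, pathogen_orthologues):
    """Transfer known interactions onto host-pathogen protein pairs.

    template_ppis: (a, b) interactions known in a reference organism.
    host_orthologues / pathogen_orthologues: map a reference protein to
    the set of its orthologues in the host / pathogen (hypothetical data).
    """
    predicted = set()
    for a, b in template_ppis:
        # Interactions are undirected, so try both orientations.
        for host_side, pathogen_side in ((a, b), (b, a)):
            for host_protein in host_orthologues.get(host_side, ()):
                for pathogen_protein in pathogen_orthologues.get(pathogen_side, ()):
                    predicted.add((host_protein, pathogen_protein))
    return predicted


def filter_by_expression(predicted, differentially_expressed_host_genes):
    """Keep predictions whose host partner is differentially expressed."""
    return {
        (host_protein, pathogen_protein)
        for host_protein, pathogen_protein in predicted
        if host_protein in differentially_expressed_host_genes
    }


# Hypothetical reference interactions and orthologue assignments.
template_ppis = [("T1", "T2"), ("T3", "T4")]
host_orthologues = {"T1": {"HUMAN_A"}, "T4": {"HUMAN_B"}}
pathogen_orthologues = {"T2": {"MTB_X"}, "T3": {"MTB_Y"}}

predicted = predict_interologues(template_ppis, host_orthologues, pathogen_orthologues)
relevant = filter_by_expression(predicted, {"HUMAN_A"})
```

Here both reference interactions yield a prediction, but only the pair involving the hypothetical differentially expressed gene HUMAN_A survives the filter.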
1.4.2.1 Using protein structure to predict host-pathogen protein interactions
In addition, since protein domains determine the structure and function of proteins, and interactions between two proteins are mediated by these domains (Mulder et al., 2014), host-pathogen interactions have been predicted using protein domains (Dyer et al., 2007).
Human-pathogen PPIs have also been predicted based on structural and sequence similarity after applying appropriate filters for biological relevance (Davis et al., 2007; Doolittle and Gomez, 2010; Mulder et al., 2014).
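The idea behind domain-based prediction can be illustrated with a deliberately simplified rule: a host protein and a pathogen protein are predicted to interact if they contain a pair of domains already known to interact. This sketch is not the method of the cited studies, and the domain annotations and the domain-domain interaction list are hypothetical.

```python
def predict_by_domains(host_domains, pathogen_domains, interacting_domain_pairs):
    """Predict host-pathogen pairs that share a known interacting domain pair.

    host_domains / pathogen_domains: protein -> set of domain identifiers.
    interacting_domain_pairs: iterable of (domain, domain) pairs, unordered.
    Returns a map from predicted protein pair to its supporting domain pairs.
    """
    known = {frozenset(pair) for pair in interacting_domain_pairs}
    predictions = {}
    for host_protein, h_domains in host_domains.items():
        for pathogen_protein, p_domains in pathogen_domains.items():
            support = sorted(
                (h_dom, p_dom)
                for h_dom in h_domains
                for p_dom in p_domains
                if frozenset((h_dom, p_dom)) in known
            )
            if support:
                predictions[(host_protein, pathogen_protein)] = support
    return predictions


# Hypothetical domain annotations and domain-domain interactions.
host_domains = {"HUMAN_A": {"DOM1", "DOM2"}, "HUMAN_B": {"DOM5"}}
pathogen_domains = {"PATH_X": {"DOM3"}, "PATH_Y": {"DOM4"}}
interacting_domain_pairs = [("DOM3", "DOM1")]

domain_predictions = predict_by_domains(
    host_domains, pathogen_domains, interacting_domain_pairs
)
```

Returning the supporting domain pairs alongside each prediction makes it easier to apply the biological-relevance filters afterwards, since each prediction can be traced back to its evidence.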
1.4.2.2 Prediction of host-pathogen protein interactions using machine learning techniques
Machine learning techniques have also been used to predict human-pathogen PPIs (Mulder et al., 2014; Tastan et al., 2009). For example, Tastan et al. (2009) trained a random forest classifier to classify HIV-1 and human protein pairs as interacting or not, using features such as human gene expression during HIV infection, as well as gene ontology features and sequence similarity.
1.4.2.3 Assessing the relevance of predicted interactions
Several computational methods have been used to predict host–pathogen interactions.
However, because few known host–pathogen interactions are documented in the literature, assessing these prediction methods is difficult (Mulder et al., 2014). For example, Rapanoel et al. (2013) identified 47 experimentally observed human–Mtb interactions in the literature, but there was little to no overlap between these known interactions and the predicted interactions in their dataset or in a similar dataset of predicted human-Mtb interactions compiled by Huo et al. (2015). Human-HIV protein interactions are far better documented owing to the HIV-1 Human Interaction Database, but even interactions in this database may need to be filtered to avoid false positives (MacPherson et al., 2010), which will be discussed in more detail in chapter 2.
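The kind of overlap assessment described above can be sketched as a comparison between a set of predicted interactions and a set of experimentally observed ones. The interaction lists below are hypothetical placeholders and do not reproduce the datasets of the cited studies.

```python
def normalise(pairs):
    """Treat interactions as unordered so (a, b) matches (b, a)."""
    return {frozenset(pair) for pair in pairs}


def overlap_summary(predicted, known):
    """Summarise the agreement between predicted and known interactions."""
    predicted_set = normalise(predicted)
    known_set = normalise(known)
    shared = predicted_set & known_set
    union = predicted_set | known_set
    return {
        "shared": len(shared),
        # Fraction of known interactions that the predictions recover.
        "recovered_fraction": len(shared) / len(known_set) if known_set else 0.0,
        # Jaccard index: shared interactions over all distinct interactions.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical interaction lists (host protein, pathogen protein).
known = [("H1", "M1"), ("H2", "M2"), ("H3", "M3"), ("H4", "M4")]
predicted = [("M1", "H1"), ("H5", "M5"), ("H6", "M6")]

summary = overlap_summary(predicted, known)
```

A low recovered fraction, as in this toy example, does not by itself show that the predictions are wrong, because the set of documented interactions is itself small and incomplete.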
1.4.2.4 Using host-pathogen interactions to understand immune responses
Knowledge of host-pathogen PPIs may help to understand how pathogens attack the host and how the host’s immune system responds. For example, predicted human-Mtb interactions have been used to identify potential drug targets in Mtb by filtering for Mtb proteins with high network centrality measures in an Mtb-Mtb PPIN (Mazandu and Mulder, 2011). The Mtb network displays scale-free and small-world properties, which make the system vulnerable to targeted attack and facilitate easy network navigation, regardless of the network size (Mulder et al., 2014). In such networks, only a few proteins are essential for the survival of the system, allowing proteins with high centrality measures to be considered as potential drug targets (Mulder et al., 2014).
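The vulnerability of hub-dominated networks to targeted attack can be illustrated by comparing the size of the largest connected component after removing the highest-degree protein with the size after removing a peripheral protein. The network below is a hypothetical hub-and-spoke toy example standing in for a scale-free topology, not the Mtb network itself.

```python
def build_graph(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def largest_component_size(graph, removed=()):
    """Size of the largest connected component once `removed` nodes are deleted."""
    seen = set(removed)  # removed nodes are never visited
    best = 0
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        best = max(best, size)
    return best


# Hypothetical network: one hub, five peripheral proteins, one extra edge.
edges = [("HUB", leaf) for leaf in ("A", "B", "C", "D", "E")] + [("A", "B")]
network = build_graph(edges)

hub = max(network, key=lambda node: len(network[node]))
intact = largest_component_size(network)
after_targeted = largest_component_size(network, removed=[hub])
after_peripheral = largest_component_size(network, removed=["E"])
```

Removing the hub fragments the toy network into small pieces, whereas removing a peripheral protein leaves the remaining proteins connected, which mirrors the reasoning for prioritising high-centrality proteins as candidate drug targets.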