

1.4 Local versus global contexts in graphs for learning network representations

Most of the recent research efforts [16–21] in NRL are directed towards learning the local neighborhood context surrounding a graph component of interest for a target downstream task.

Few research works capture global contexts that depend on the underlying network connectivity. Representative examples include: i) incorporating the neighborhood context surrounding a node [22,18,23,20] to learn node embeddings for node classification, ii) incorporating common subgraphs surrounding a (source, target) node pair to learn edge embeddings for relation prediction [24–26], iii) considering only a fixed-length window over sampled random walks to learn contextualized node embeddings [16,17,12,27], and iv) considering network-only proximity to group the nodes of a graph for node clustering [10,28–31]. Approaches i), ii), and iii) provide a microscopic view of the network, and the learned graph-component embeddings lack discriminative capacity. Owing to several challenges that frequently arise in real-world graphs of different types (see Sections 3.2, 4.2, and 5.2 for details), such narrow views of graphs may not prove useful for a downstream task. Figure 1.4 elucidates

Fig. 1.4 A toy example showing the usefulness of community information for (A) node classification and (B) link prediction. Communities can see beyond local neighborhoods; they are especially useful for resolving sparsity issues in graphs.

this point more clearly. In Subfigure A of Figure 1.4, we consider relational features for classifying the node colored Red. Based on the functional classes (colored Green and Blue) of the Red node's one-hop neighbors, it is easy for a prediction model to annotate the Red node with the Blue functional class, since Blue class labels are abundant around the target node. However, in the more challenging scenario depicted on the right-hand side, the functional class annotations are missing for the Red node's one-hop neighbors, and the relational-feature-based prediction algorithm therefore finds it hard to classify the Red node.

However, given the additional ground-truth information that the Red node is part of the underlying community (highlighted in Yellow), and that the majority of that community's member nodes are annotated with the Blue functional class, the prediction model receives graph-structure-based cues that directly aid in predicting the Red node's class.
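This community-based intuition can be sketched as a simple majority vote over labeled community members. The sketch below is a toy illustration, not a method from this dissertation; all node names and the community are hypothetical:

```python
from collections import Counter

def predict_by_community(node, labels, communities):
    """Predict a node's class by majority vote over the labeled members
    of its community (a toy stand-in for community-aware inference)."""
    community = next(c for c in communities if node in c)
    votes = Counter(labels[m] for m in community if m in labels and m != node)
    return votes.most_common(1)[0][0] if votes else None

# Toy setup: the target node's one-hop neighbors are unlabeled,
# but most of its community carries the "blue" class.
labels = {"n1": "blue", "n2": "blue", "n3": "blue", "n4": "green"}
communities = [{"red", "n1", "n2", "n3", "n4", "u1", "u2"}]
print(predict_by_community("red", labels, communities))  # blue
```

Even with no labeled one-hop neighbors, the community vote supplies the structural cue described above.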

In Subfigure B of Figure 1.4, we consider learning embeddings based on common two-hop neighborhood structures (including the nodes, relations, and paths within them) to predict the existence of a link between the two Red nodes. Common neighbors up to two hops are highlighted in Green.

On the left-hand side, we observe that, given the local contexts surrounding the node pair, it is easy to justify the formation of a link between the target nodes. However, in the more challenging scenario depicted on the right-hand side, the network suffers from link sparsity, so far fewer common neighbors are available to the link inference algorithm. The lack of local context makes it hard for the inference algorithm to confidently predict links for the target nodes. But given the additional higher-order structural cue that the Red nodes are part of the same community, whose other members, including the Red nodes' neighbors, are densely connected to the rest of the community, the inference algorithm obtains more evidence in support of the existence of a link between the Red nodes. Therefore, the macroscopic network view, i.e., the higher-order structural features of the underlying network, is important and plays a critical role in the learning mechanism.
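The contrast between a purely local link heuristic and a community-aware one can be sketched in a few lines. This is an illustrative toy scoring scheme (the graph, node names, and bonus weighting are invented for the example), not the inference algorithms cited above:

```python
def common_neighbor_score(adj, u, v):
    """Classic local heuristic: number of shared one-hop neighbors."""
    return len(adj[u] & adj[v])

def community_aware_score(adj, u, v, communities, bonus=1.0):
    """Toy community-aware heuristic: add a bonus when both endpoints
    fall in the same community."""
    score = common_neighbor_score(adj, u, v)
    same = any(u in c and v in c for c in communities)
    return score + (bonus if same else 0.0)

# Sparse toy graph: the target pair shares no neighbors,
# but both endpoints sit in the same community.
adj = {"r1": {"a"}, "r2": {"b"}, "a": {"r1"}, "b": {"r2"}}
communities = [{"r1", "r2", "a", "b"}]
print(common_neighbor_score(adj, "r1", "r2"))               # 0
print(community_aware_score(adj, "r1", "r2", communities))  # 1.0
```

The local score is zero under sparsity, while the community term still supplies evidence for the link.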

A graph also exhibits prominent higher-order structures that can enrich graph-component representations for a downstream task. Figure 1.5 shows various higher-order graph structures at multiple scales (finer to coarser). To give a few examples: i) incorporating the communities exhibited in a graph for node and/or edge embedding, ii) incorporating a set of higher-order relational paths for learning edge representations, iii) considering motifs, subgraphs, and network schemas for node embedding, and iv) considering node proximities other than network connectivity to group nodes are some of the ways to incorporate graph structures into graph-component embeddings. These structural intuitions are not incorporated automatically; i.e., local-context-based network representation learning models do not explicitly learn from them.
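As a small illustration of one such higher-order structure, triangle motifs incident to a node can be counted directly from adjacency sets. The graph below is a made-up toy example:

```python
from itertools import combinations

def triangles_at(adj, node):
    """Count triangle motifs incident to `node`: pairs of its
    neighbors that are themselves connected."""
    return sum(1 for u, v in combinations(adj[node], 2) if v in adj[u])

adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(triangles_at(adj, "a"))  # 1  (the a-b-c triangle)
```

Motif counts like this are one of the simplest higher-order features a structure-aware model can exploit beyond one-hop neighborhoods.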


Fig. 1.5 Higher-order structures at multiple scales for graphs. Examples include walks, hierarchies, hyperedges, motifs, and communities (clockwise).

Image Source: partially from the Internet.

1.5 Broad Challenges and Motivations

In this dissertation, we aim to incorporate structural cues from the underlying graph for network representation learning. We consider three types of graphs as our subjects of study: simple homogeneous, heterogeneous, and multiplex networks. As downstream ML tasks, we consider node classification for homogeneous and multiplex graphs, and link prediction for heterogeneous graphs. A thorough literature review (refer to Sections 3.6, 4.6, and 5.6) of the NRL methods on these subject graphs reveals critical research gaps and ample scope for addressing the central research theme of this dissertation: How can higher-order, multi-scale graph-structure-based intuitions be incorporated to enrich target graph-component embeddings for the chosen downstream ML tasks?

Some of the critical research gaps for our chosen subject graphs are elucidated next; they motivate us to design structure-aware NRL frameworks in the respective graph domains as our primary contributions.

Methods that learn structures in homogeneous graphs primarily focus on either capturing k-hop local contexts of nodes, using i) random-walk-based methods [16,17] and ii) Graph Neural Networks (GNNs) [18], or encoding explicit clustering criteria, in i) matrix factorization methods [20,29], ii) auto-encoders [32], and iii) GNNs [33–35,30]. However, random walks, GNNs, auto-encoders, and proximity-based methods are limited to capturing only k-hop local contexts of nodes. Moreover, all the community-enforcing models use unsupervised clustering criteria based on either network-only node proximities or embedding-based node proximities.
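The k-hop local context that random-walk methods capture can be sketched as follows. This is a minimal DeepWalk-style sampler written for illustration, not the implementations cited above; the toy graph and window size are assumptions:

```python
import random

def random_walk(adj, start, length, seed=0):
    """Sample a fixed-length walk over the adjacency structure."""
    rng = random.Random(seed)
    walk = [start]
    while len(walk) < length and adj[walk[-1]]:
        walk.append(rng.choice(sorted(adj[walk[-1]])))
    return walk

def context_pairs(walk, window=2):
    """Emit skip-gram style (node, context) pairs restricted to a
    local window on the walk; this is the 'local context' a model sees."""
    return [(walk[i], walk[j])
            for i in range(len(walk))
            for j in range(max(0, i - window), min(len(walk), i + window + 1))
            if i != j]

adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
walk = random_walk(adj, "a", length=4)
print(walk)
print(context_pairs(walk)[:4])
```

Note how every training pair is confined to the window: nodes outside it, and any community-level structure, never enter the objective.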

In contrast, the classic assumptions of Semi-Supervised Learning (SSL) [36] advocate the clusterability of nodes with similar target functions in dense regions. The Cluster Assumption of SSL [36] treats the supervision information available in the graph data, in association with the underlying linked-data distribution, as an indicator of community membership for the nodes. However, none of the existing SSL methods incorporate supervision-knowledge-based global neighborhoods of nodes to enrich their local embeddings. This calls for considering the various sources of supervision knowledge exhibited in graphs to form logical groupings of nodes, and for examining what advantages this offers over unsupervised communities, which are limited by ground-truth network connectivity patterns. To address this, we design a Non-Negative Matrix Factorization based framework for joint node and cluster embedding learning that explicitly incorporates all the necessary priors of SSL, especially the cluster assumption.
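The flavor of this idea can be sketched with plain multiplicative-update NMF on the adjacency matrix, where labeled nodes are softly pulled toward the cluster of their class. This is only an illustrative sketch under simplifying assumptions (the toy graph, the penalty weight alpha, and the update rule are all invented for the example), not the proposed framework:

```python
import numpy as np

def semi_supervised_nmf(A, labels, k, alpha=0.5, iters=200, seed=0):
    """Factorize A ~= U @ V.T (rows of U act as node-cluster memberships)
    while pulling labeled nodes toward their class's cluster.
    labels[i] = c assigns node i to cluster c; -1 means unlabeled."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    U = rng.random((n, k)) + 1e-3
    V = rng.random((n, k)) + 1e-3
    Y = np.zeros((n, k))                       # supervision prior
    for i, c in enumerate(labels):
        if c >= 0:
            Y[i, c] = 1.0
    mask = (Y.sum(1, keepdims=True) > 0)       # only labeled rows are pulled
    for _ in range(iters):
        # multiplicative updates for ||A - U V^T||^2 + alpha * ||mask*(U - Y)||^2
        U *= (A @ V + alpha * mask * Y) / (U @ (V.T @ V) + alpha * mask * U + 1e-9)
        V *= (A.T @ U) / (V @ (U.T @ U) + 1e-9)
    return U, V

# Two 3-node cliques joined by one edge; one labeled node per clique.
A = np.array([[0,1,1,0,0,0],[1,0,1,0,0,0],[1,1,0,1,0,0],
              [0,0,1,0,1,1],[0,0,0,1,0,1],[0,0,0,1,1,0]], float)
U, _ = semi_supervised_nmf(A, labels=[0, -1, -1, 1, -1, -1], k=2)
print(U.argmax(1))  # per-node cluster assignments anchored by the two labels
```

The supervision term is what distinguishes this from purely connectivity-driven clustering: the labeled rows of U are biased toward their class's cluster, and the factorization propagates that bias through the graph structure.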

For multiplex networks, among the various learning paradigms such as graph convolutions [37–39], random walks [40,41], and matrix factorization [21], only a few recent works encode global structures [38,39] in multiplex graphs, via explicit clustering using Graph Neural Networks (GNNs) [38] and via InfoMax-based learning (see footnote 6), respectively. InfoMax-based learning [2,43,39,44] uses GNNs and provides a scalable way to incorporate both local and global node representations by maximizing the Mutual Information (MI) between them.

Nevertheless, in a typical setup, it considers a single shared global graph summary, obtained from a trivial aggregation of all the learned node representations, and uses that same summary to optimize MI with every local node representation. This trivial graph summary encodes a great deal of noisy information into the learned embeddings, which calls for a contextualized global graph representation unique to each node. To address this, we propose the Cluster-Aware InfoMax learning objective and design a novel GNN framework based on it to jointly model nodes and clusters for learning enriched node representations.
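The contrast between one shared summary and per-node contextualized summaries can be sketched with plain array operations. The embeddings and cluster assignment below are made-up toy data; this illustrates only the readout step, not the proposed objective:

```python
import numpy as np

def global_summary(H):
    """Shared readout: one summary vector paired with every node."""
    return H.mean(axis=0)

def cluster_summaries(H, assign):
    """Cluster-aware readout: each node is paired with the mean
    embedding of its own cluster (assign[i] = cluster of node i)."""
    k = assign.max() + 1
    means = np.stack([H[assign == c].mean(axis=0) for c in range(k)])
    return means[assign]           # one contextualized summary per node

H = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
assign = np.array([0, 0, 1, 1])
print(global_summary(H))            # [0.5 0.5], identical for all nodes
print(cluster_summaries(H, assign))
```

With two well-separated clusters, the shared summary averages them into an uninformative midpoint, while the cluster-aware readout keeps a distinct, relevant summary for each node.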

State-of-the-art (SoTA) NRL link prediction methods [26,19,45] for heterogeneous graphs primarily employ graph neural networks to learn local neighborhood contexts, such as the enclosing subgraph surrounding a (source, target) node pair, for learning edge representations. Very few attempts have been made so far to use other perspectives, such as relational paths [46–48] or network schemas [49,50], beyond the surrounding context for predicting links. A number of link prediction models [24,25] design structure-aware GNNs by introducing a node-labeling scheme that distinguishes the structural roles of the nodes in the common subgraph surrounding the node pair under consideration. However, they

6 InfoMax is an optimization principle prescribing that a function mapping a set of input values I to a set of output values O should be chosen or learned so as to maximize the average Shannon mutual information between I and O, subject to a set of specified constraints and/or noise processes [42].

1.6 Research Objectives