Epidemiological Prediction using Deep Learning


Academic year: 2023


In particular, the Graph Convolutional Network (GCN) model is used to capture the geographical features of US epidemiology. Epidemiology is the study of the distribution and determinants of health-related conditions or events, and it can be applied to the control of a disease or other health problem [3]. With an appropriate choice of the transmission rate (S → I) and the recovery rate (I → R), the SIR model can be solved as a system of ordinary differential equations [5].
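The SIR dynamics above can be sketched with a simple forward-Euler integration. The rate values `beta` (transmission, S → I) and `gamma` (recovery, I → R) below are illustrative, not taken from the text:

```python
# Minimal SIR sketch solved by forward-Euler integration.
# dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I.

def simulate_sir(s0, i0, r0, beta, gamma, dt=0.1, steps=1000):
    """Integrate the SIR equations with a fixed step; returns the trajectory."""
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        dr = gamma * i
        s, i, r = s + ds * dt, i + di * dt, r + dr * dt
        history.append((s, i, r))
    return history

if __name__ == "__main__":
    traj = simulate_sir(s0=0.99, i0=0.01, r0=0.0, beta=0.5, gamma=0.1)
    s, i, r = traj[-1]
    print(f"final S={s:.3f}, I={i:.3f}, R={r:.3f}")
```

Because dS + dI + dR = 0 at every step, the total population fraction is conserved, which is a quick sanity check on the integration.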

Due to the nature of longitudinal data, a time series can be represented with a relatively small number of variables. Before applying any method, however, the type of time series data must be defined. In this context, machine learning is the study of algorithms that a computer learns on its own without explicit instructions.

The network formed by the connection of synapses can learn by changing the strength of the synaptic connections, which translates into the weights in the machine learning literature. However, the multilayer perceptron becomes difficult to train because the number of weights grows as the number of hidden layers increases. In dropout, typically 50–80% of the nodes are used, and the excluded nodes do not participate in learning [35].

In the update step, each component of the weight vector is adjusted by the corresponding component of the gradient of the error function ∇E(wn).
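The componentwise weight update w(n+1) = w(n) − η∇E(w(n)) can be sketched on a toy quadratic error; the error function and learning rate here are illustrative, not from the paper:

```python
# Toy gradient-descent update: w_{n+1} = w_n - eta * grad_E(w_n).
# E(w) = sum((w_j - 1)^2) is an illustrative error with minimum at w_j = 1.

def grad_e(w):
    """Gradient of E(w) = sum((w_j - 1)^2)."""
    return [2.0 * (wj - 1.0) for wj in w]

def gradient_descent(w, eta=0.1, steps=100):
    """Repeatedly subtract eta times each gradient component from each weight."""
    for _ in range(steps):
        g = grad_e(w)
        w = [wj - eta * gj for wj, gj in zip(w, g)]
    return w

if __name__ == "__main__":
    print(gradient_descent([5.0, -3.0]))  # converges toward [1.0, 1.0]
```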

Recurrent Neural Network (RNN)

Sathler used an MLP, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) to predict dengue fever from climate characteristics and dengue data [49], as did Baquero et al. However, the conventional RNN has a problem with long-term computation: bias accumulates, making the computation inaccurate as new input is shadowed by previously stacked information. Such standard RNNs can learn short-term dependence but not long-term dependence, due to the vanishing gradient during backpropagation through time.

When there is a large time gap between a previous input and the current input, the vanishing gradient problem occurs. The LSTM solves this problem by introducing gates that selectively evaluate the input information. Unlike the vanilla RNN, the LSTM has two types of state: the hidden state learns short-term information and the cell state learns long-term information.

This information is controlled by the gates: the input gate determines how much of the candidate to add to the cell state Ct; the forget gate, how much to forget from the previous cell state Ct−1; and the output gate, how much information to send to the next hidden state. Here ht denotes the hidden state, Ct the cell state, and C̃t the candidate cell state. After deciding with the forget gate how much of Ct−1 to keep, the product of the input gate and C̃t is added to form the new Ct.
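A single scalar LSTM step makes the gating concrete. For brevity this sketch reuses one toy weight pair for every gate; in a real LSTM each gate has its own learned weights:

```python
import math

# One scalar LSTM step. All weights are illustrative constants, and the
# same toy weights are reused for every gate (real LSTMs learn separate ones).

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w=0.5, u=0.5, b=0.0):
    """Single LSTM cell update returning (hidden state, cell state)."""
    i_t = sigmoid(w * x + u * h_prev + b)        # input gate: how much candidate to add
    f_t = sigmoid(w * x + u * h_prev + b)        # forget gate: how much of C_{t-1} to keep
    o_t = sigmoid(w * x + u * h_prev + b)        # output gate: how much to expose as h_t
    c_tilde = math.tanh(w * x + u * h_prev + b)  # candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde           # new cell state (long-term memory)
    h_t = o_t * math.tanh(c_t)                   # new hidden state (short-term memory)
    return h_t, c_t

if __name__ == "__main__":
    h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0)
    print(h, c)
```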

The predictive performance of the LSTM exceeds that of the RNN. Volkova et al., to the best of our knowledge, were the first to use an LSTM to predict ILI dynamics from language signals in social media, which suggests that communication behavior is powerful for prediction and can be used to train neural network models where historical ILI data are not available [57]. The LSTM shows good predictive performance for several diseases, such as typhoid fever, hemorrhagic fever, mumps, conjunctivitis [58], and hand-foot-mouth disease [59], and shows improved performance when combined with a genetic algorithm to estimate the initial parameters [58, 60].

The main difference is that the GRU combines the forget gate and the input gate into a single update gate, achieving a simpler structure. The update gate zt determines the ratio of the previous memory to the current input. The GRU is capable of capturing long-term sequence data while requiring less memory.
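The role of the update gate can be sketched in one scalar GRU step, where zt interpolates between the previous hidden state and the candidate; the weights are toy values for illustration:

```python
import math

# One scalar GRU step: the update gate z_t replaces the LSTM's separate
# forget and input gates by interpolating old state and candidate.
# Weights are illustrative constants, not learned values.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h_prev, w=0.5, u=0.5):
    """Single GRU cell update returning the new hidden state."""
    z_t = sigmoid(w * x + u * h_prev)                # update gate
    r_t = sigmoid(w * x + u * h_prev)                # reset gate
    h_tilde = math.tanh(w * x + u * (r_t * h_prev))  # candidate state
    return (1.0 - z_t) * h_prev + z_t * h_tilde      # interpolated new state

if __name__ == "__main__":
    print(gru_step(x=1.0, h_prev=0.0))
```

Setting z_t close to 0 preserves the previous memory; close to 1, it is overwritten by the candidate, which is the "ratio" the text describes.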

Convolutional Neural Network (CNN)

For example, average pooling is used to smooth the image and max pooling to sharpen it, which suits MNIST data, where the contrast between the background and the handwriting is important. Among the various CNN models, a residual network (ResNet) combined with an RNN is used to capture the spatial features of epidemics. A graph can be represented as G = (V, E), where V is the set of vertices (nodes) and E the set of edges (relations).
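The smoothing/sharpening contrast between the two pooling operations can be seen on a toy 4×4 image, pooled over non-overlapping 2×2 windows (the image values are illustrative):

```python
# Max pooling vs. average pooling over non-overlapping 2x2 windows.
# Max pooling keeps the strongest activation (sharpening); average
# pooling blends the window (smoothing).

def pool2x2(image, op):
    """Apply `op` (e.g. max or mean) to each non-overlapping 2x2 window."""
    out = []
    for i in range(0, len(image), 2):
        row = []
        for j in range(0, len(image[0]), 2):
            window = [image[i][j], image[i][j + 1],
                      image[i + 1][j], image[i + 1][j + 1]]
            row.append(op(window))
        out.append(row)
    return out

def mean(values):
    return sum(values) / len(values)

if __name__ == "__main__":
    img = [[0, 0, 9, 9],
           [0, 1, 9, 8],
           [2, 2, 0, 0],
           [2, 2, 0, 4]]
    print(pool2x2(img, max))   # [[1, 9], [2, 4]]
    print(pool2x2(img, mean))  # [[0.25, 8.75], [2.0, 1.0]]
```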

The vertices can represent parts of the system, such as an individual or a region, and the edges the relationships or interactions between nodes, such as a connection or link. Figure 6 shows an example of a graph and its associated matrices, where the adjacency matrix is AN×N and the feature matrix is XN×F. The adjacency matrix is a symmetric pairwise matrix with element values of 1 or 0, where 1 represents a connection and 0 no connection.

To let each node include its own features in the update, the identity matrix is added to the adjacency matrix, changing the diagonal values to 1. In the case of a weighted graph, a similarity matrix with elements between 0 and 1 is used instead of the adjacency matrix. As the name suggests, the GCN performs convolutional computations and updates the feature values of the nodes while the shape of the graph remains unchanged.
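Adding the identity matrix amounts to giving every node a self-loop, as in this sketch on a toy 3-node path graph:

```python
# Adding the identity matrix to the adjacency matrix gives each node a
# self-loop (diagonal entries become 1), so a node's own features are
# included when aggregating its neighbours. Toy 3-node graph below.

def add_self_loops(adjacency):
    """Return A + I for a square 0/1 adjacency matrix."""
    n = len(adjacency)
    return [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
            for i in range(n)]

if __name__ == "__main__":
    a = [[0, 1, 0],
         [1, 0, 1],
         [0, 1, 0]]
    print(add_self_loops(a))  # [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
```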

A single hidden graph layer receives first-order neighborhood information by multiplying the adjacency matrix and the feature matrix, and as layers stack up, a node receives information from higher-order neighboring nodes. In the Graph Neural Network (GNN), the model from which the GCN originates, the embedding of a new node is made from the sum of the aggregated neighboring nodes and concatenated with the original node itself. Unlike the GNN, which uses different weights for the aggregated and concatenated features, the GCN uses the same weight for the node and its neighbors.

Equation (17) represents the recursive calculation of how the lth-layer features are updated to the (l+1)th layer, where Hi(l+1) represents the hidden state at layer l+1 and node i, W(l) the weight matrix, b(l) the bias vector, and σ the activation function. Since the adjacency matrix A is not normalized, it can change the scale of the feature vector. Despite the advantages of the GCN, which can capture meaningful information over non-Euclidean structures, it has not, to the best of our knowledge, been explored in the epidemiological literature.
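A common remedy for the scaling issue is symmetric normalization, H(l+1) = σ(D^(−1/2)(A + I)D^(−1/2) H(l) W(l)). The sketch below shows one such layer on a toy 2-node graph; the graph, features, and weight are illustrative values, not the paper's:

```python
import math

# One GCN layer, H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W), where D is the
# degree matrix of A + I. Symmetric normalisation keeps the feature scale
# stable. Graph, features, and weight below are toy values.

def matmul(a, b):
    """Plain dense matrix product of nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gcn_layer(adjacency, features, weight):
    n = len(adjacency)
    # Add self-loops, then normalise symmetrically by node degree.
    a_hat = [[adjacency[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    h = matmul(matmul(norm, features), weight)
    return [[max(0.0, v) for v in row] for row in h]  # ReLU activation

if __name__ == "__main__":
    a = [[0, 1], [1, 0]]   # two connected nodes
    x = [[1.0], [2.0]]     # one feature per node
    w = [[1.0]]            # identity-like weight
    print(gcn_layer(a, x, w))  # [[1.5], [1.5]]
```

With both nodes connected, each updated feature is the degree-normalised mix of its own and its neighbour's feature, so both nodes land on 1.5.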

Graph Convolutional Network Gated Recurrent Unit (GCNGRU)

Data Description

In practice, the table from the ILI activity map and an adjacency matrix containing neighbor information were used. [Figure: (a) weekly ILI activity level.]

Evaluation metrics

The purpose of this evaluation is to check the performance of the GCNGRU model by comparing it with an existing model, CNNRNN-res, suggested by Wu et al. As introduced in Section 2.2 on CNNs, CNNRNN-res is a combination of ResNet (CNN) and GRU (RNN). To the best of our knowledge, CNNRNN-res is the first and only model that captures both the spatial and temporal features of the data by means of deep learning methodologies.

For example, a horizon value of 1 represents a prediction horizon of 1 week, which is the time step of the original data. This value assesses the ability of the GRU cell to capture the structural and semantic features of the input data. For mini-batch optimization, the batch size lies between 1 and the size of the training set.
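The batch-size range mentioned above can be made concrete by splitting a training set into mini-batches; the sample data and batch size are illustrative:

```python
# Splitting a training set into mini-batches. A batch size of 1 gives
# stochastic gradient descent; a batch size equal to the training-set
# size gives full-batch gradient descent.

def minibatches(data, batch_size):
    """Return consecutive, non-overlapping chunks of at most batch_size items."""
    return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]

if __name__ == "__main__":
    samples = list(range(10))
    print(minibatches(samples, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```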

For this experiment, a learning rate of 0.001 was found to be the optimal hyperparameter. As mentioned at the beginning of the model section, this paper used other optimizers besides gradient descent, since gradient descent must compute the loss function over the whole dataset at each step, which leads to a large computational cost.

References

Choi, "Development of epidemic model using the stochastic method," Journal of the Korean Data and Information Science Society, vol.

Gershenfeld, "The Future of Time Series," in Proceedings of the NATO Advanced Research Workshop on Comparative Time Series Analysis.
Slutzky, "The Sum of Random Causes as a Source of Cyclical Processes," Econometrica: Journal of the Econometric Society, p.
Whittle, "The analysis of multiple stationary time series," Journal of the Royal Statistical Society: Series B (Methodological), vol.
Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, vol.
Hopfield, "Neural networks and physical systems with emergent collective computational abilities," Proceedings of the National Academy of Sciences, vol.
Marathe, "DEFSI: Deep learning based epidemic forecasting with synthetic information," Proceedings of the 31st Innovative Applications of Artificial Intelligence Conference (IAAI), 2019.
Chunara, "Deep landscape features for improving vector-borne disease prediction," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019, p.
Mei, "A Deep Residual Network Incorporating Spatio-Temporal Features for Predicting Influenza Trends at an Intra-Urban Scale," in Proceedings of the 2nd ACM SIGSPATIAL International Workshop on AI for Geographic Knowledge Discovery.
