Machine learning techniques are finding broader and increasingly ubiquitous application in the natural sciences and engineering. One such field of interest within the physics community is the training and implementation of neural networks to assist with quantum many-body calculations. Conversely, research exploring the possible computational benefits of using quantum many-body dynamics in the fields of artificial intelligence and machine learning has also recently begun to gain traction.
After the spin configurations are converted to binary form, a bit count is calculated for each state of the system. An exact diagonalization of the XXZ Heisenberg Hamiltonian is then used to calculate the energy eigenvectors and eigenvalues. The model is trained to identify the bit count of the system's lowest-energy state, which is found through a stochastic search algorithm known as simulated annealing.
INTRODUCTION
LINEAR ALGEBRA
QUANTUM MECHANICS
If we choose an orthonormal basis, the inner product of two vectors can be written neatly in terms of their components. In quantum theory, the state of a system is represented by its wave function Ψ, and these wave functions live in a (possibly infinite-dimensional) Hilbert space. Measurable quantities such as the energy, momentum, or position of a given quantum system are obtained by applying the corresponding operator.
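As a concrete statement of this (in the standard Dirac notation assumed throughout), the inner product of two states expanded in an orthonormal basis $\{|e_i\rangle\}$ reduces to a sum over components:

$$\langle \phi | \psi \rangle = \sum_i \phi_i^{*} \psi_i, \qquad \text{where } \langle e_i | e_j \rangle = \delta_{ij},$$

and it is precisely the orthonormality condition that makes this component form possible.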
The expectation value of an observable, $\langle \hat{A} \rangle = \langle \Psi | \hat{A} | \Psi \rangle$, is then analogous to taking a series of measurements and averaging them. In linear algebra, a linear transformation can be represented by a matrix acting on a vector to produce a new vector, and in quantum mechanics this extends to operators acting on the wave function, represented as $\hat{A}|\Psi\rangle$. The operator of interest in this study is the Hamiltonian operator, and much of the focus of this work is on how to find solutions to the time-independent Schrödinger equation, $\hat{H}|\Psi\rangle = E|\Psi\rangle$.
Here we will explore a special technique known as exact diagonalization, a way of diagonalizing the Hamiltonian matrix that takes advantage of certain symmetries to find solutions for small spin systems. We will also construct our model Hamiltonian to incorporate spin operators in order to examine these systems.
QUANTUM SPIN AND THE HEISENBERG MODEL
SPIN ½ OPERATORS
From here we will derive the spin operators $\hat{S}^x$, $\hat{S}^y$, $\hat{S}^z$, and the so-called ladder operators, $\hat{S}^\pm$, which we will use to construct the Hamiltonian. Acting on the spin states $|s, m_s\rangle$, we see that these ladder operators have the effect of "raising" a spin-down vector (and, correspondingly, "lowering" a spin-up vector).
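For reference, the standard spin-1/2 matrix representation consistent with this construction (in units of $\hbar$, in terms of the Pauli matrices) is

$$\hat{S}^{x} = \frac{1}{2}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \hat{S}^{y} = \frac{1}{2}\begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \hat{S}^{z} = \frac{1}{2}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \hat{S}^{\pm} = \hat{S}^{x} \pm i\,\hat{S}^{y},$$

so that, for example, $\hat{S}^{+}|{\downarrow}\rangle = |{\uparrow}\rangle$ while $\hat{S}^{+}|{\uparrow}\rangle = 0$.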
INTERACTION HAMILTONIAN
Using the properties of the Pauli spin matrices [8], we can simplify this to the form $\hat{S}^x_i \hat{S}^x_{i+1} + \hat{S}^y_i \hat{S}^y_{i+1} = \tfrac{1}{2}\left(\hat{S}^+_i \hat{S}^-_{i+1} + \hat{S}^-_i \hat{S}^+_{i+1}\right)$. The addition of the external magnetic field is necessary for our particular model, as we will use it later to adjust the orientation of certain spin sites in order to make predictions about their states. Also notice that we change our upper-bound notation from N to L, since L (the length of the chain) will represent the number of spins in our system; our model is then the sum of interactions between spins in the 1D spin chain.
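Putting these pieces together, a standard form of the resulting XXZ chain with a longitudinal field reads (the coupling symbols $J$ and $\Delta$ are conventional names supplied here, since the original equation was lost in extraction):

$$\hat{H} = \sum_{i=1}^{L} \left[ \frac{J}{2}\left( \hat{S}^{+}_{i}\hat{S}^{-}_{i+1} + \hat{S}^{-}_{i}\hat{S}^{+}_{i+1} \right) + \Delta\, \hat{S}^{z}_{i}\hat{S}^{z}_{i+1} \right] + h_z \sum_{i=1}^{L} \hat{S}^{z}_{i}.$$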
EXACT DIAGONALIZATION WITH QUSPIN
EXACT DIAGONALIZATION OF THE XXZ MODEL
PROGRAMMING WITH PYTHON AND QUSPIN
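A minimal sketch of how such a model can be built and diagonalized with QuSpin is shown below. The chain length L, the couplings J, Delta, and hz, and the use of periodic boundaries are illustrative assumptions, not the exact parameters used in this work.

```python
import numpy as np
from quspin.basis import spin_basis_1d
from quspin.operators import hamiltonian

L = 8                          # chain length (assumed for illustration)
J, Delta, hz = 1.0, 1.0, 0.5   # XXZ couplings and field (assumed values)

basis = spin_basis_1d(L)       # full 2**L spin-1/2 Hilbert space

# Coupling lists: [strength, site_i, site_j], with periodic boundaries.
J_pm = [[0.5 * J, i, (i + 1) % L] for i in range(L)]
J_zz = [[Delta, i, (i + 1) % L] for i in range(L)]
h_z = [[hz, i] for i in range(L)]

# "+-"/"-+" give the ladder-operator hopping terms, "zz" the Ising part.
static = [["+-", J_pm], ["-+", J_pm], ["zz", J_zz], ["z", h_z]]
H = hamiltonian(static, [], basis=basis, dtype=np.float64)

E, V = H.eigh()    # all eigenvalues/eigenvectors (exact diagonalization)
psi0 = V[:, 0]     # ground-state eigenvector (E[0] is the lowest energy)
```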
DESIGNING A NEURAL NETWORK OF SPINS
MACHINE LEARNING AND NEURAL NETWORKS
The primary decisions we are interested in lie in the area of pattern recognition, which is important because it is a central mechanism by which people learn. When the human brain receives some form of data, whether it is viewing a selection of fresh produce or observing trends in the stock market, it can distinguish between different pieces of data, classify them according to their properties, and then make a decision based on that classification. At a microscopic level, the learning mechanism in a biological neural network is quite simple: neurons send tiny electrical signals to each other, and these signals carry information.
These signals are then passed from one neuron to another in a complex network of neural activity, similar to that shown in Figure 4. Here the activation function shown is a step function, but other, "softer" functions such as the sigmoid can be used and are often preferred. This perceptron model, together with its activation function, forms the standard on which most neural networks are designed.
The basic structure of a neural network is a set of nodes that simulate neurons, together with weighted connections between them. The size of each weight determines whether the threshold of the activation function is reached. Finally, by arranging the nodes in a series of layers, a sequence of neuron activations can form a subnetwork within the neural network and, depending on which subnetworks are formed, the network will output different values.
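As an illustrative sketch of the perceptron model just described (weighted sum, threshold, and a step or sigmoid activation; all names below are our own, since the text gives no code):

```python
import numpy as np

def step(z):
    """Hard threshold activation: fires only if the weighted input crosses 0."""
    return np.where(z >= 0.0, 1.0, 0.0)

def sigmoid(z):
    """A 'softer' alternative to the step function."""
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, w, b, activation=step):
    """Single neuron: weighted sum of inputs plus bias, then activation."""
    return activation(np.dot(w, x) + b)

# Example: a perceptron computing logical AND of two binary inputs.
w, b = np.array([1.0, 1.0]), -1.5
for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, perceptron(np.array(x, dtype=float), w, b))
```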
The appeal of a neural network is that a model can be trained on one set of input data and then applied to new data to make relatively accurate predictions about the output. A typical example is image classification, where we try to distinguish between images containing a cat and images containing a dog. Applications of machine learning and neural networks have now reached a much wider field than playing checkers or recognizing cats.
A growing area of research within the physics community is the combination of quantum many-body dynamics and machine learning. Much of this research has shown that these two fields complement and motivate each other in new and revealing ways. One of these new areas of interest is investigating how neural networks can help solve complex quantum many-body problems and, conversely, how quantum mechanics can improve the computational efficiency of machine learning techniques such as neural networks.
NEURAL NETWORK OF SPINS
Many fields of research in engineering and science have recently adopted these computational tools and are beginning to find new ways to apply them. In this way our algorithm can take large, complex quantum systems with $2^L$ states and represent them simply as a series of small numbers, in order to train a model that can later be applied to other, similar numerical data. First, we convert the system states by iterating through each binary vector and summing the spin values, 0s and 1s, to produce a bit count.
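A minimal sketch of this conversion step (variable names are our own):

```python
import numpy as np

L = 8  # chain length (assumed for illustration)

# Each basis state is an integer whose binary digits encode the spin
# configuration (0 = down, 1 = up) at each of the L sites.
states = np.arange(2 ** L)

# Bit count: the number of up-spins (1s) in each binary configuration.
bit_counts = np.array([bin(int(s)).count("1") for s in states])
```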
The elements of this bit-count array will act as target values that our model will try to predict. The $L_{in}$ spins are set and fixed using the external magnetic field component $h_z$ of the Hamiltonian, and they store the target bit counts. The $L_{out}$ spins are then configured to mimic the target bit counts of the $L_{in}$ spins by encoding the $\hat{S}^z$ measurement of the possible $2^{L_{out}}$ outputs.
To determine the state of each $L_{out}$ spin, we calculate the expectation value of $\hat{S}^z$ with respect to the ground-state eigenvector, $\langle \psi_0 | \hat{S}^z_i | \psi_0 \rangle$. In this way the spin orientation, ↑ or ↓, given by $\hat{S}^z = \pm 1/2$ at each site of the lowest-energy state can be compared to the value of the bit at that site. We then repeat this comparison for each configuration in order to set the $\hat{S}^z$ values of the $L_{out}$ spins.
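Continuing the QuSpin sketch above, the per-site $\hat{S}^z$ expectation values in the ground state can be computed along these lines (again an illustrative sketch, reusing the assumed basis, L, and psi0 from before):

```python
# Expectation value <psi0| Sz_i |psi0> for each site i of the chain.
sz_expectations = []
for i in range(L):
    static_i = [["z", [[1.0, i]]]]        # local Sz operator at site i
    Sz_i = hamiltonian(static_i, [], basis=basis, dtype=np.float64,
                       check_herm=False, check_symm=False, check_pcon=False)
    sz_expectations.append(np.vdot(psi0, Sz_i.dot(psi0)).real)

# Sites with <Sz> near +1/2 are read as up (1), near -1/2 as down (0).
readout_bits = [1 if sz > 0 else 0 for sz in sz_expectations]
```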
Next, we select a classifier model that is both effective and appropriate for the dynamics of our system.
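The text does not name a specific classifier at this point; purely as an illustration of the kind of model selection meant here, a scikit-learn classifier could be fit to the spin configurations and bit-count targets built above (reusing states, bit_counts, and L from the earlier sketch):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# X: binary spin configurations (one row per state); y: their bit counts.
X = np.array([[int(b) for b in format(int(s), f"0{L}b")] for s in states])
y = bit_counts

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```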
STOCHASTIC SEARCH METHODS
We could then propose a greedy algorithm that randomly selects different sites, assigns them one of these values, and iterates through possible configurations in search of the lowest energy. However, this kind of greedy algorithm often proves insufficient for such optimization problems, as it is likely to get stuck at a local minimum instead of finding the global minimum corresponding to the lowest-energy state. In physics, a process known as annealing is used to find the lowest energy of a system.
The motivation behind this process is that at very high temperatures the relative orientations of the magnetic dipoles in a material are highly random. As the material cools, these small magnets begin to align with or oppose each other in a way that reduces the energy shared between them. The higher the initial temperature and the slower the cooling schedule, the more likely the system is to find a global minimum.
The probability of finding the system in a state with energy $E_\gamma$ therefore follows the Boltzmann distribution, $P(E_\gamma) = e^{-E_\gamma / k_B T} / \sum_{\gamma'} e^{-E_{\gamma'} / k_B T}$, decreasing exponentially with energy; the sum in the denominator is the partition function, which normalizes the distribution. Although this interpretation works well for systems whose sites are independent of one another, we wish to evaluate systems with varying interdependence in their site-site connections.
SIMULATED ANNEALING ALGORITHM
SIMULATED ANNEALING
After this step, the temperature is gradually lowered according to the cooling rate and the process is repeated. As the temperature decreases, the probability that a higher-energy configuration will be accepted decreases, and the simulation terminates when the temperature is either very low or the algorithm reaches its maximum number of steps.
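A generic sketch of the simulated-annealing loop just described is shown below; the Metropolis acceptance rule and a geometric cooling schedule are standard choices, and the function and parameter names are our own rather than those of this work.

```python
import numpy as np

def simulated_annealing(energy, flip, state, T0=10.0, cooling=0.99,
                        T_min=1e-3, max_steps=10_000, rng=None):
    """Generic simulated annealing (illustrative sketch).

    energy(state) -> float; flip(state, rng) -> new candidate state.
    """
    rng = rng or np.random.default_rng()
    T = T0
    E = energy(state)
    best_state, best_E = state, E
    for _ in range(max_steps):
        candidate = flip(state, rng)
        dE = energy(candidate) - E
        # Metropolis criterion: always accept downhill moves; accept
        # uphill moves with probability exp(-dE / T).
        if dE < 0 or rng.random() < np.exp(-dE / T):
            state, E = candidate, E + dE
            if E < best_E:
                best_state, best_E = state, E
        T *= cooling              # geometric cooling schedule
        if T < T_min:             # stop once the system has "frozen"
            break
    return best_state, best_E
```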
CODE DESCRIPTION
PLOT RESULTS
CONCLUSION