Multibody Interactions

Multibody interactions versus many-body problems

While multibody interactions are important in many applications, the kinds we have included in our hardware are not something we often encounter in day-to-day life. In other words, they are very uncommon in physics, despite being relatively common in computing. The ability to directly implement such interactions is part of what makes our hardware powerful. Often, people discuss the idea of a three-body (or more generally “many-body”) problem, which was historically defined in orbital mechanics, and the apparent impossibility of generally solving the behavior of more than two objects subject to gravity. There is a similar quantum many-body problem that solves for the behavior of atoms. This can be done by hand with a single electron, but once more than one is present the calculation becomes much more difficult.

The gravitational many-body problem, however, still only involves two-body interactions. Each body (planet, star, moon, asteroid, whatever it is) only interacts with each of the others separately. In other words, the equation for the forces can be written as a sum of forces between each pair of objects. For example, if we consider a system of three asteroids, the force on any of them will simply be the sum of two forces each of which will be proportional to the product of two of the masses divided by the distance squared. 

asteroids
F1=Gm1m2r2r12+Gm1m3r3r12F_1=\frac{Gm_1m_2}{\left|r_2-r_1\right|^2}+\frac{Gm_1m_3}{\left|r_3-r_1 \right|^2}

These forces which are added up are each two-body interactions, and they each only involve masses and positions of two of the objects. In fact, it is very difficult to find examples in commonly encountered physics where multibody interactions are present. Newton’s third law of motion is fundamentally a two-body statement, since there is a force and an “equal and opposite” force, implying that equations for forces can be broken down to terms involving pairs of objects. In addition to physical analogies being hard to come by, the fact that they don’t appear much in physics (even quantum physics) also makes multibody interactions hard to engineer. 

Multibody expressions that would map to multibody interactions are, however, somewhat common in mathematics and computing, and as we discuss in some examples later, they are useful. For example, it turns out that arbitrary logical statements can be built up from clauses that reference three bits, but not if each statement only mentions two. Checking whether sums of many bits are even or odd is useful in error correction, and the most natural statement of the problem of factoring a number into primes involves statements with terms over four bits. 

As a more tangible example of a logical expression which is fundamentally three-body, consider the three ingredients needed to start a fire: heat, air, and fuel. If a chemical plant wants to avoid fires it needs to avoid having all three of these together at the same time, but any two are fine. Heating something flammable without air is fine, and may be necessary to get a desired reaction to happen. Most flammable substances (let’s assume we are not working with anything too extreme) can be exposed to air at room temperature, and heating air just results in hot air, which is not problematic by itself. The logical statement to avoid fire is therefore to not have high temperatures, fuel, and air in the same place simultaneously, a statement which fundamentally references three distinct conditions.

fire

Linear and quadratic terms

When thinking of modeling an optimization problem, we can think of the order of interactions needed between the terms of the objective function. In other words, what is the maximum number of terms multiplied together in the equations that express the problem? Single terms are called linear and pairwise products are called quadratic. Without constraints, any purely linear objective function is easy to solve; for example, if we consider the following linear unconstrained binary optimization (LUBO, defined similarly to a QUBO, where each variable can be 00 or 11),

E=x13x2E=x_1-3x_2

it is fairly clear that this function can be minimized by setting x1=0x_1=0 and x2=1x_2=1. Moreover, even if there are many terms, the function could be minimized by just making all terms with a negative prefactor one and those with a positive prefactor zero. For this reason, LUBOs and any unconstrained linear models are not commonly studied because they are not very interesting. Linear problems which include constraints can be computationally hard and of great interest; generally, this is the field of linear programming. If we instead consider objective functions that can be expressed in terms of products of pairs of variables, we obtain the well-studied setting of quadratic unconstrained binary optimization problems, QUBOs. QUBOs are known to be NP-hard, meaning that they can be mapped from any other optimization problem. To see some of the increased richness that a QUBO gives us over a purely linear model, we can note that not every term in a QUBO can necessarily give us the minimal contribution possible to the objective function; for example

E=x1x2x2x3x1x3E=x_1x_2-x_2x_3-x_1x_3

gives a minimum objective value of 1-1 when x1=x2=x3=1x_1=x_2=x_3=1 (or x1=x3=1x_1=x_3=1 and x2=0x_2=0), although one might naively think that the two negative terms could give a value of 2-2, that is actually impossible without incurring a penalty from the positive term. This competition (known as frustration in the language of physics) makes it non-trivial to find the arrangement that solves a general QUBO. The question then becomes what is the value of higher order terms such as x1x2x3x_1x_2x_3 or x1x2x3x4x_1x_2x_3x_4. By virtue of being NP-hard, we know that QUBOs can map anything, but beyond a relatively mild guarantee of not exploding exponentially, this doesn’t guarantee how practical such mappings would be. As will be important later on, strong second-order interactions can be used to implement constraints. It turns out higher-order terms of this form come about quite naturally in many settings. However, before we do that, we should discuss how to map these interactions to quadratic interactions, to give an idea of the overheads one faces without having a device that natively supports them.

Mapping high-order to second-order

For binary variables, mapping from high-order terms to lower-order terms can be done by adding new variables; for example, this paper proposes a way to map a term of the form x1x2x3x_1x_2x_3, by adding a single additional variable, and a term of the form x1x2x3...xnx_1x_2x_3...x_n by adding nn such variables. While we will not discuss these strategies in detail, it is worth knowing how they work conceptually. The idea is that strong coupling is used to force the values of the additional variables; effectively, the additional (sometimes called auxiliary) variables are constrained. Recall that strong second-order coupling can enforce constraints so that their value is determined by the value of the variables that are actually used for computation. By using linear or quadratic terms on these additional variables, we can effectively realize higher-order coupling. As a concrete example, the third-order coupling mapping from this paper constrains the auxiliary variable to take the value of a majority vote. Hence, given the linear penalties on this auxiliary and the already available quadratic and linear terms, any third-order interaction can be realized.

While there may be many mapping strategies, a key takeaway is that if only quadratic interactions are available, at least one auxiliary variable will be needed per higher-order term. This can greatly grow the number of variables needed to describe the problem, particularly if a high number of high-order terms are needed. There are other more subtle disadvantages to mapping; firstly, the couplings to implement the constraints must be strong. On real devices, if the couplings are made too strong they could wash out the coupling used to actually describe the objective function. Secondly, most hardware solvers work by updating variables one at a time, so adding an auxiliary variable that must also change its value can disrupt performance. This is particularly true when it must pass through an intermediate configuration that necessarily has high energy due to constraints being violated. 

Also, on most platforms, it is difficult to directly implement higher-than-quadratic interactions. Speculating on the exact reasons would be too much of a digression into physics, but many proposals that have been made for implementing higher-order interactions in superconducting systems just use built-in auxiliary variables, such as this proposal, and this proposal. Despite being difficult to realize in most physical settings, there are many important high-order problems. We discuss some simple ones below, but there are many others. For example, one of the first instances of mapping high-order to low-order terms in the context of quantum annealing was to map a protein folding problem.

Satisfiability

A class of problems that lie at the heart of computer science are known as satisfiability problems. These problems consist of clauses, which are statements related to a series of variables. For example, x1ORx2=TRUEx_1\, OR\, x_2=TRUE requires that either variable x1x_1 or variable x2x_2 should take a 11 (TRUETRUE) value, but not necessarily both. Satisfiability problems are conventionally written in what is known as conjunctive normal form (CNF). This means that the problem consists of determining if a collection of conjunctive clauses (statements using OROR and possibly NOTNOT, which requires a 00) can be simultaneously satisfied. For example, the fact that this statement is true is usually taken as being implied.

(x1ORNOTx2)AND(NOTx1ORx2ORNOTx3)(x_1\, OR\, NOT\, x_2)\, AND\, (NOT\, x_1\, OR\, x_2\, OR\, NOT \,x_3)

A satisfiability problem can be written as a statement about a polynomial expression of the variables (assuming they take 00 and 11 values) in the following way; each conjunctive clause can be written as a product of terms that evaluates to 1 if the clause is violated, and 0 otherwise. For example, (1x1)x2(1-x_1)x_2 will yield a 11 value if and only if x1=0x_1=0and x2=1x_2=1; in other words, if (x1ORNOTx2)=FALSE(x_1\, OR\, NOT\, x_2)=FALSE. Conjunctive clauses can be converted into such polynomial terms (assuming each appears at most once) by multiplying (1xi)(1-x_i) every time xix_i appears in the expression and multiplying by xix_i whenever NOTxiNOT\, x_i appears. Disjunctive (ANDAND) terms can simply be implemented by addition. Since these products of terms can only take 00 or 11 values, any sum greater than 00 indicates that the clause is not satisfied. As an example, the CNF satisfiability problem given earlier is equivalent to asking if a set of variables can be found so that 

(1x1)x2+x1(1x2)x3=x2x1x2+x1x3x1x2x3=0(1-x_1)x_2+x_1(1-x_2)x_3=x_2-x_1x_2+x_1x_3-x_1x_2x_3=0

Clearly, the order of the interaction needed is equal to the number of variables, which are combined using OROR operations in a single clause. Two-satisfiability, the version where each clause only involves two variables, is not a hard computational problem. Three-satisfiability, on the other hand, was one of the first problems shown to be NP-complete and is often used in reductions to show NP-hardness. For this reason, third-order interactions allow us to access the interesting set of satisfiability problems.

Factoring

Factoring consists of finding an expression for an integer as a product of two smaller integers. For example, given a number NN, a factoring task could be to find two integers X1X_1 and X2X_2 such that

X1X2=NX_1X_2=N

While strongly suspected to not be NP-hard, factoring is a highly important problem in cryptography, and most widely used cryptographic methods would be broken if it could be done efficiently for large numbers. In fact, the existence of Shor’s algorithm, a quantum algorithm that could efficiently perform this task on an error-free universal quantum computer, was an early motivation for the field of quantum computing.

Factoring can be cast as a minimization problem, with an objective function of 

(X1X2N)2=X12X222NX1X2N2(X_1X_2-N)^2=X^2_1X^2_2-2NX_1X_2-N^2

This objective takes the minimum value of zero only when X1X_1 and X2X_2 are factors of NN. The resulting expression, however, is fourth order since it involves the product of squared terms. Note that our device has the capability of handling integer terms directly, but if we ignore that for the moment we can construct a simple binary example. If we allow X1X_1 and X2X_2 each to take values of up to three, they can be written as 

X1x1(1)+2x1(2)X_1\rightarrow x_{1(1)}+2x_{1(2)}

similarly for X2X_2. Setting the ones place x1(1)=1x_{1(1)}=1 and the twos place x1(2)=0x_{1(2)}=0 will give a value of X1=1X_1=1, x1(1)=0x_{1(1)}=0 and x1(2)=1x_{1(2)}=1 gives X1=2X_1=2, and x1(1)=1x_{1(1)}=1 and x1(2)=1x_{1(2)}=1 gives X1=3X_1=3. Using this, we can encode the factoring of 6=2×36=2\times 3  as the task of finding the binary values which minimize

(x1(1)x2(1)+2x1(1)x2(2)+2x1(2)x2(1)+4x1(2)x2(2)6)2(x_{1(1)}x_{2(1)}+2x_{1(1)}x_{2(2)}+2x_{1(2)}x_{2(1)}+4x_{1(2)}x_{2(2)}-6)^2

While fully expanding this expression out would be laborious, it is relatively straightforward to check that this expression can be made equal to zero by setting x1(1)=0x_{1(1)}=0 and x1(2)=x2(1)=x2(3)=1x_{1(2)}=x_{2(1)}=x_{2(3)}=1 since 4+2=64+2=6; this further encodes X1=2X_1=2 and X2=3X_2=3, which is the correct factorization. It is also clear that it would involve terms up to fourth order since it is already a quadratic expression and is squared. In particular, there will be a term of the form x1(1)x1(2)x2(1)x2(2)x_{1(1)}x_{1(2)}x_{2(1)}x_{2(2)}. Therefore, the most direct way of expressing factoring problems involves fourth-order terms.

Decoding of Error Correction

Error correction is the task of using redundant information to recover a partially corrupted message. When bits are lost or flipped, the most effective information to send is parity check information. This information is whether the sum of a set of bit values is odd or even. Unlike the previous examples, error correction decoding is most naturally expressed in terms of Ising variables zi=2(12xi)z_i=2(\frac{1}{2}-x_i). Given that our original xix_i variables can take values 00 and 11, the ziz_i variables take values 11 and 1-1. Based on this mapping we can see that zizj...zmz_iz_j...z_m will be equal to 1-1 if xi+xj+...x_i+x_j+... is odd and 11 if it is even. These are usually called “parity checking” bits. And, since the order of the interaction scales with the number of bits in the sum, decoding of error correction is fundamentally a multi-body problem. As an example, we consider decoding the famous Hamming [7,4] code, which is a code where three parity checking bits (also potentially subject to noise) can be used to correct a single error on four transmitted bits (as well as correctly identifying that no correction is needed if a single parity check bit is corrupted).

The Hamming [7,4] code consists of transmitting four data bits d1d_1, d2d_2, d3d_3, and d4d_4. Three “parity” bits are also transmitted to aid in correction: p1=mod2(d1+d2+d4)p_1=\mathrm{mod}_2(d_1+d_2+d_4), p2=mod2(d1+d3+d4)p_2=\mathrm{mod}_2(d_1+d_3+d_4) and p3=mod2(d2+d3+d4)p_3=\mathrm{mod}_2(d_2+d_3+d_4). The task of decoding is then to determine the original transmitted bits that correspond to the least number of bit flip errors occurring. Ising variables ziz_i can be used as the original bits where it is understood that 1-1 corresponds to an original bit value of 11, while a ziz_i value of 11 corresponds to an original bit value of zero. Having received all seven bits, the task of decoding consists of minimizing (where dd corresponds to a bit that takes a 00 or 11 value).

E=2[(d112)z1+(d212)z2+(d312)z3+(d412)z4E=2[(d_1-\frac{1}{2})z_1+(d_2-\frac{1}{2})z_2+(d_3-\frac{1}{2})z_3+(d_4-\frac{1}{2})z_4
+(p112)z1z2z4+(p212)z1z3z4+(p312)z2z3z4]+(p_1-\frac{1}{2})z_1z_2z_4+(p_2-\frac{1}{2})z_1z_3z_4+(p_3-\frac{1}{2})z_2z_3z_4]

By inspecting this formula we can see how the error correction works. Each of the zz values appear in at least two of the three variable strings. So, if only a single bit value is flipped it would end up being energetically more costly to assign a value other than the original one, thus, allowing the original bitstring to be recovered. Moreover, this implies that a single error on one of the parity checks will be overridden by the combination of the other parity check (or checks in the case of z4z_4) and the bit value which was transmitted; thus, the code would not erroneously “correct” a value if a single parity check were corrupted. 

Error correction is commonly used in communication in practice, and can achieve close to the best theoretically allowed behavior. This is possible because of the discovery of a class of codes known as turbo codes, where the decoding problem corresponds to an easy optimization, rather than a hard one. It was later realized that low-density parity-check codes, which were proposed decades earlier had the same property. However, there may still be niche applications where one might have to use a different code for some reason (i.e. - quantum error correction)--one where the correction does correspond to a hard optimization problem. In any case, these codes provide a simple example of multi-body problems which arise in the real world.

Conclusion

Having multibody interactions available at a hardware level has many different and important implications of the types of problems that can be solved. As shown above, multibody interactions can be applied in different and meaningful ways. While multibody interactions can be mapped to two-body interactions, this process will involve at least an extra variable per interaction, and may reduce solving efficiency. A unique aspect of our technology is that we can implement the multibody interactions directly.