January 7, 2021

Euclidean Algorithm and Linear Diophantine Equations

Shivam Singhshivam51

DURATION

15min

INTRODUCTION

In this article, we will be talking about the greatest common divisor (gcd) and algorithms to calculate it. The Euclidean algorithm (or Euclid’s algorithm) is one of the most used and most common mathematical algorithms, and despite its heavy applications, it’s surprisingly easy to understand and implement.

In the simplest form the gcd of two numbers a, b is the largest integer k that divides both a and b without leaving any remainder. We will denote it as gcd(a, b), which is the standard representation. When one of the numbers in the pair is 0, by definition, the gcd is the second number. When both the numbers are 0, the greatest common divisor is undefined but is assumed to be 0.

The trivial algorithm to find the gcd of two numbers will be to loop from 1 to min(a, b) and check for each number if it divides both a and b. The range of gcd(a, b) is [1, min(a, b)] because a number cannot divide another number smaller than it. For the same reason, we loop from 1 to min(a, b).

The pseudocode for the above idea would be:

1
2
3
4
5
6
7
8
9
10
11
12
Let a, b be the numbers whose gcd we are calculating
Assume a <= b
If a == 0
Return b
Endif
gcd = 1
for every i(1 <= i <= a)
If a is divisible by i and b is divisible by i
gcd = i
Endif
Endfor
Return gcd

Another way to find the gcd would be to run the loop from i = a to i = 1 and return gcd at the first occurrence of ‘i’ such that it divides both a and b. In both cases, the time complexity in the worst case for this algorithm remains O(n). The Euclidean algorithm, on the other hand, gives us a way to compute the greatest common divisor of two numbers in O(log(min(a,b))). This article will introduce the Euclidean algorithm to find the gcd and its applications in competitive programming.

SOME OBSERVATIONS WITH GCD

You can observe that the gcd of two numbers does not change if the larger number is replaced by the difference between the two numbers. By doing this repeatedly, in the end, one of the elements will become 0 and the gcd becomes the second element (which is non-zero). With each run one of the numbers decreases, which makes sure the algorithm will terminate eventually.

PROOF OF CORRECTNESS

Let a be pk and b be qk where k is the gcd of a and b and without any loss of generality assume a <= b, i.e., p <= q.
Now the difference b-a = (q-p)*k where q - p >= 0
It can see that the gcd of a and b-a is also k.
On repeating this process further, in the end, we get 0, and the other number is k.

Pseudocode for the above idea:

1
2
3
4
5
6
7
8
gcd(a, b)
if a > b
swap(a, b) // keeping a to be the smaller of the pair
Endif
If a == 0
Return b // nase case
Endif
Return gcd(b - a, a)

This can be represented mathematically as,

1
2
gcd(a, b) = b when, a = 0
gcd(b - a, a) otherwise...Equation 1

TAKING IT FURTHER TO THE MAIN THEOREM

The same algorithm can be improved a little, replacing the larger number with b%a instead of b-a. b%a means b modular a or the remainder when b gets divided by a. The proof of this change is in Equation 1.
Let the gcd of a and b be k. Therefore k|a and k|b where p|q means p divides q.
Now by definition, b%a = b - ⌊b/a⌋*a
Both b and ⌊b/a⌋*a are divisible by k; hence b - ⌊b/a⌋*a is also divisible by k, which makes b%a | k. After this point, we can follow what we did in Equation 1, replacing each subtraction with a mod.

Taking the same example of 14 and 26

14 26 replace 26 with 26%14
14 12 replace 14 with 14%12
2 12 replace 12 with 12%2
2 0

As soon as one of the numbers turns 0, exit, gcd is the second number.
Here is some pseudocode for the above algorithm:

1
2
3
4
5
	gcd(a, b)
	If a == 0
	Return b
	Endif
	Return gcd(b % a, a)

and the iterative code

1
2
3
4
5
6
7
	gcd(a, b)
	while a > 0
	temp = b
	b = a
	a = temp % a
	EndWhile
	Return b

The complexity of this algorithm is log2min(a, b). In reality, the number of the function call is not more than five times the digits in base 10 of the smaller number. We can see the power of logarithm here: in the trivial algorithm defined at the start of this article, the number of times the code runs for reasonably large numbers, assume in the order 10^9, would be ~ 10^9, but using Euclid’s algorithm, we can compute the greatest common divisor in not more than thirty steps.

EXTENDED EUCLIDEAN ALGORITHM

The extended Euclidean algorithm states that for any two positive integers a and b, there always is m and n such that it is possible to represent the gcd of a and b as a * m + b * n.

Therefore,
a * m + b * n = gcd(a, b) for some integer m and n, they can be negative or zero.

While the Euclidean algorithm finds the gcd of two numbers, the extended algorithm also allows us to represent this gcd in terms of these two numbers. The importance of this result is seen more in the next topic, linear diophantine equations.

In this original Euclidean theorem, the operations end when one of the numbers is 0 and the other is g. For these parameters, we can easily find the coefficients m and n, which is

0*0 + g*1 = g … Equation 2

Where 0 is m and 1 is n, instead of 0, m can be taken as an integer, as the equations remain true; it is interesting to see that on changing the values here, we get different final values of m and n. For example if a = 3 and b = 5 then
3*(2) + 5*(-1) = 1 but also,
3*(-3) + 5*(2) = 1. So, changing the value of m and n in the base case gives us different m and n for the original equation.

By reversing the steps in Euclid’s theorem, we can find the coefficients m and n. All we need to do is to figure out how the value of m, n changes from (b%a, a) to (a, b).

Let us assume we know some x0 and y0 such that

1
2
3
4
		ax0 + (b % a) y0 = g
		ax0 + (b - ⌊b / a⌋ * a) y0 = g
		a(x0 - ⌊b / a⌋ y0) + by0 = g
		am + bn = g

Therefore m = x0 - ⌊b/a⌋y0
n = y0

Pseudocode for the above result looks like:

1
2
3
4
5
6
7
8
9
10
		gcd(a, b, n, m) // The equation would look like a*m + b*n = g
		If a == 0
		m = 0, n = 1
		Return b
		EndIf
		xx, yy
		g = gcd(b % a, a, xx, yy)
		x = yy - ⌊b / a⌋ * xx;
		y = xx;
		Return g

LINEAR DIOPHANTINE EQUATIONS

We have three integers, a, b, and c, such that a * x + b * y = c, we are required to find a solution to this linear equation. Linear equations in two variables are known as linear diophantine equations, and the extended Euclidean algorithm helps in solving these problems.

Let us look at the extended Euclidean algorithm equation again, denote g as the gcd(a, b), and x_g, y_g be integers for which:

ax_g + by_g = g is true
Therefore,
axg*(c/g) + byg*(c/g) = c is also true

Therefore the x we require is equal to x_g*(c/g) and y = y_g*(c/g). If c is divisible by gcd(a, b), then the linear equation has one or more solutions; otherwise, it does not. It is easy to conclude that a linear combination of any integer should be divisible by that number.
Therefore one of the solutions to the above linear diophantine equation is:

x = xg*(c/g)
y = yg*(c/g)

Pseudocode for the above is, using the gcd function we calculated earlier:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
Find_one_solution(int a, int b, int c, int & m, int & n)
g = gcd(a, b, m, n)
If c % g != 0
return false // if c is not divisible by g, no solution exists
EndIf
m = m * c / g
n = n * c / g
If a < 0
m = -m
EndIf
If b < 0
n = -n
EndIf
Return true

CONCLUSION

Kudos! You completed it. You can now solve any problem that uses Euclidean’s algorithm or is about finding solutions to linear equations. In competitive programming, most of the time, the problems do not come with the straight task of computing the gcd of two numbers. Still, often it is a subtask to solving some DP or greedy or rather any mathematical problems. Diophantine equations are rarely anything but advantageous when there are problems like "In how many ways can <x> and <y> combine with <conditions> and their sum is <z>. " It’s always important to keep the mathematical theorem and algorithms handy when you want to increase ratings.

Chat on Discord