GCD Part 2

In the first post on greatest common divisor we saw that (a,b) is the largest positive integer d such that d|a and d|b. There are in fact multiple ways of characterizing the GCD of two integers. In this post I will present another way of looking at GCD. This characterization says that (a,b) is the least positive integer that is a linear combination of a and b, i.e. a number ma+nb for m,n\in \Z. The proof uses the well-ordering property which says that every non-empty subset of positive integers has a least element (this can be taken as an axiom of the integers). The proof will also use the division algorithm and facts about divisibility. Here is the theorem:

Theorem: The greatest common divisor of the integers a and b, not both zero, is the least positive integer that is a linear combination of a and b.

Proof: Suppose d is the least positive integer such that d=am+bn for some integers m,n. We know that the set of the set of positive integers am+bn is nonempty because we may take either m=1 or m=-1, so our d exists by the well-ordering property. By the division algorithm we have that a=dq+r for some integers q,r where 0\leq r < d, and so

    \[ r=a-dq=a-(am+bn)q=a-amq-bnq=(1-mq)a-bnq. \]

So r is a linear combination of a and b but r<d. If r>0 this would contradict our assumption that d is the least such positive integer, so it must be that case that r=0. So in fact a=dq, so d|a. Similarly we may show that d|b. This shows d is a common divisor of a and b. It remains to show it is the greatest common divisor. Suppose e>d and e|a and e|b. Then e|am+bn so e|d, but this implies e\leq d a contradiction. So d is indeed the greatest common divisor.

In sage the xgcd(a,b) command gives a tuple with three entries: (d,m,n) where d=am+bn as in the proof above.

Sieve of Eratosthenes

The Sieve of Eratosthenes is the classic example of a number sieve. But before explaining this algorithm, here is a particularly elegant description of sieve theory by Dr. Terrance Tao(blog):

The idea is to try to uncover the primes as a set in aggregate. The primes you can think of as some sort of sculpture buried inside the integers. So the natural numbers: 1,2,3,4,5… you can think of as some sort of block of granite and you’re trying to carve out the primes out of this block of granite. So what you do is you take this big mallet and you’re gonna wack off big chunks off of this granite sculpture and then after a while you take away your mallet and you take a chisel and you chisel off some smaller pieces and then you put away your chisel and you take a needle and make little tiny adjustments, and then eventually you get the primes.

The sieve of Eratosthenes is an algorithm for finding all of the prime numbers up to a given positive integer N. Here is the algorithm:

  1. List all numbers 2\leq n \leq N (one is not a prime)
  2. Starting with the first element of the list, 2, remove all multiples of 2 from the list.
  3. Move to the next element of (modified) list and remove all multiples of that number from the list.
  4. Repeat this process until all multiples of \floor{\sqrt{N}} are removed from the list.
  5. The final list contains all the primes less than N.

The reason we only need to check up to \floor{\sqrt{N}}, follows from the fact that every number up to and including N has prime factors not exceeding \sqrt{N}. For suppose N were composite, then N=ab for 1<a\leq b<n. If a>\sqrt{n} then ab>\sqrt{n}\sqrt{n}=n, a contradiction. Thus we only need to remove numbers from the list with divisors (are multiples of) numbers up to \sqrt{N}.

What makes this idea so good is that each time the list gets smaller and so on each iteration you need to check less and less numbers, analogous to Dr. Tao’s description of sculpting a block of granite. On the first iteration you hit the block with a large mallet by removing all multiples of two, half of the numbers in the list. Subsequent iterations involve knocking off relatively smaller chunks of numbers.

Now for some code.

Proof of the prime factorization

The fundamental theorem of arithmetic states that every integer greater than 1 is either a prime number or is the product of prime numbers and this factorization is unique. The strong principle of mathematical induction makes quick work of the prime factorization part, but uniqueness takes a bit more work. First we must consider two base cases, 2 and 3. Well these are both prime so the proposition holds. Now we are using strong induction so for the inductive step we assume that for some positive integer n, every integer 1<k<n has a unique prime factorization. Now if n is prime we are done. If not then n=ab for a,b<n. Thus each of a and b are a product of primes, and so ab is a product of primes, as desired.

Here is some sage code that gives the prime factorization of a given integer n.

Greatest Common Divisor

It is a fact, called the fundamental theorem of arithmetic, that every integer can be factored into a product of prime numbers. This can be written mathematically as

    \[ n = \prod_{i=1}^k p_i, \]

for any integer n and where the p_i are prime. Clearly p_i|n for each i. Some primes may be repeated, so we can in fact write n as the product of prime powers but I will not do this here. With this notation we can define a divisor d of n as

    \[ d = \prod_{\alpha\in I} p_\alpha \]

where I\subset \{1,2,\ldots, k\}. In other words, if d is some product of primes that are in the prime factorization of n. If d is such a number then d|n. We can subsequently define the greatest common divisor of two integers m and n as

    \[ \gcd(a,b) = \prod_{\alpha\in I} p_\alpha \]

where I=\{\alpha \text{ such that } p_\alpha|m \text{ and } p_\alpha|n  \}. We can also write \gcd(a,b)=(a,b). If (a,b)=1 we say that a and b are coprime. From these definitions follow

Proposition: For two integers a,b if (a,b)=d then (\frac{a}{d},\frac{b}{d})=1.

It is very easy to find the GCD of two numbers with sage.

Fun with Goldbach Partitions

Any positive integer can be written as the sum of positive integers. For example the number 4 can be written as

    \begin{align*} 4&= 4 \\ &= 3+1 \\ &= 2+2 \\ &= 2+1+1 \\ &= 1+1+1+1 \\ \end{align*}

Each of these five distinct ways of writing the number is called a partition. Note that the number itself is indeed a partition. Also the order does not matter (i.e. 3+1 is the same partition as 1+3). Lets try a few more numbers. We can write 18=7+11 and 21=3+5+13. Note that each of the numbers in the previous two partitions are prime. In fact it was conjectured in 1742 by German mathematician Christian Goldbach that every odd number greater than 7 can be written as the sum of three odd primes, and that every even integer greater than 4 can be written as the sum of two odd primes. The second statement implies the first, for if 2n+1 is a sufficiently large odd integer, then 2n+1 = 2(n-1)+3 = p+q+3, where p,q are two odd primes that partition the even number 2(n-1). The first statement is known as the little Goldbach conjecture, while the second is just the Goldbach conjecture. The Goldbach conjecture is one of the oldest unsolved problems in mathematics. No proof of is known and neither has anyone found a counterexample.

This is a simple program that prints out the binary (two) Goldbach partitions for an even number and the ternary (three) Goldbach partitions for an odd integer

If you test enough numbers you might notice a pattern. For example, the number 100 has 6 Goldbach partitions while the number 101 has 37 Goldbach partitions. For bigger numbers the discrepancy is even more striking, for 900 has 48 Goldbach partitions while 901 has 917! It seems as though there are many more ternary Goldbach partitions for an odd number than there are binary Goldbach partitions for an even number (as long as these numbers are close together). The following program illustrates this with a plot of G(n), the Goldbach partition counting function:

How to Attack a Cryptosystem

Starting Point

The general starting point for cryptography is to consider a person Alice who wants to send a secure message to another person Bob. First Alice writes out her message, called the plaintext. Then she encrypts the message to produce the cyphertext. She send this to Bob who subsequently decrypts it to reveal the plaintext. We may also suppose the existence of an eavesdropper named Eve. She may have various goals such as decrypting a single encrypted message, or to discover the encryption/decryption key in order to read all future messages. Maybe Eve wants to communicate with Bob as if she were Alice.

Possible Attacks

There are four main ways Eve might be able to attack the secure communication system between Alice and Bob.

  1. Cyphertext-only: Here Eve only has access to cyphertext that she intercepted.
  2. Known-plaintext: Here Eve has access to cyphertext as well as the corresponding plaintext, called the crib.
  3. Chosen-cyphertext: Eve is able to create an arbitrary string of cyphertext and decrypt using an unknown decryption key it and find the corresponding plaintext.
  4. Chosen-plaintext: Where Eve can make an arbitrary string of plaintext and encrypt it with an unknown encryption key to find the cyphertext.

Classical cyphers were vulnerable to these types of attacks but more modern cyphers attempt to be resistant to all of them. I will refer back to this list in future posts on specific cyphers.

Two’s Complement

Two’s complement is a way of writing signed integers, that is both positive and negative, using binary digits. This notation simplifies computer arithmetic as we will see. For an integer x such that -2^{n-1}\leq x \leq 2^{n-1}-1, its two’s complement is written using n bits as follows. If x is positive the leftmost bit is 0 and the remaining n-1 bits are the binary expansion of x. If x is negative the leftmost bit is 1 and the remaining n-1 bits are the binary expansion of 2^{n-1}-|x|.

Example: 22=010110 and -19=101101.

Here is some sage code to test this out:

Proposition: If m is an integer with two's complement representation (a_{n-1}a_{n-2}\cdots a_1a_0), then

    \[ m = -a_{n-1}2^{n-1} + \sum_{i=0}^{n-2}a_i2^i. \]

Proof: If a_{n-1}=1 then m<0, so

    \begin{align*} -a_{n-1}2^{n-1} + \sum_{i=0}^{n-2}a_i2^i &= -2^{n-1} + (2^{n-1}-|m|) \\ &=m \end{align*}

Otherwise we have trivially that \sum_{i=0}^{n-2}a_i2^i = m. \hfill\blacksquare

Proposition: Given the twos's complement representation of an integer m, to find the two's complement representation of -m just flip the bits and add 1. In other words, Twos(-m)= (\text{NOT } Twos(m))+1.
Proof: Clearly we must flip the first bit to change the sign of the number. If m is positive then we have

    \[ 2^{n-1}-1 = m + (2^{n-1}-1-m). \]

One way of interpreting this equation is that if m's two's complement representation has n bits, then the first is zero, and the remaining n-1 bits are either 0 or 1. The 1 bits are what constitute m. If we take all the remaining 0 bits and flip them on then we get 2^{n-1}-1. Thus if we flip the bits in the two's complement of m then we will get 2^{n-1}-1-m, which is one less than what we want. So just add one and we get the two's complement of -m. Now if m<0 then we have the equation

    \[ 2^{n-1}-1 = (2^{n-1}-|m|)+(|m|-1). \]

Using the same logic as above we see that the method still works. \hfill\blacksquare

Adding Integers with Arbitrary Bases

I have written some code to add two numbers in arbitrary bases with arbitrary digits. Scroll to the bottom to perform your own additions.

Working with common number bases

For more information on integer representation read this post.

The most common bases to work with in computing are 2 (binary), 8 (octal), 10 (decimal), and 16 (hexadecimal). In particular it is very easy to convert between 2 and 8 or 2 and 16.

Recall that an octal number is a string consisting of the digits 01234567. A number in octal is written as such:

    \begin{align*} (d_{k}\cdots d_1d_0)_8 &= d_{k}\times 8^k + \cdots + d_1\times 8^1 + d_0\times 8^0 \\ &= d_k \times (2^3)^k + \cdots + d_1\times 2^3 + d_0 \end{align*}

Now each octal digit consists of at most three bits (binary digits), so we make the following conversion table:

Octal Digits 0 1 2 3 4 5 6 7
Binary Digits 000 001 010 011 100 101 110 111

So to convert an octal number to binary just replace each octal digit with the three corresponding binary digits. Conversion between binary and hexadecimal is similar except each hex digit corresponds to four bits. Here is the conversion table:

Hexadecimal Digits 0 1 2 3 4 5 6 7
Binary Digits 0000 0001 0010 0011 0100 0101 0110 0111
Hexadecimal Digits 8 9 A B C D E F
Binary Digits 1000 1001 1010 1011 1100 1101 1110 1111

In general, to convert between base-r and base-r^n, make a conversion table where every base-r^n digit is represented as an n-digit base-r number. Then substitute the numbers in the table to make the desired conversion.

Converting between various number bases is easy in sagemath:



SageMath is an open-source mathematics software package that has many cool features. I like SageMath because it is all open-source. In fact it is an amalgam of many other open-source software packages. One feature is called SageMathCell and it allows for embedding a code box directly into a website which can then be executed and the output displayed. This is just what I have been looking for. I look forward to using this extensively in my future blog posts.

Here is an example:

Here is another:

When the “Evaluate” button is pressed, the code is sent to the SageMath servers and executed, and the output is sent back to the website. It is possible to set up your own server, which will be a fun project for me in the next few months.