Cryptography, Number Theory and Sports Analytics – An Unlikely Trio

Updated: Jun 27

By Vidyaratnam Ganapathy




Cryptography is the encoding of a message so that it can be understood only by the sender and the intended receiver. It prevents third parties from reading important messages, and is mainly used in communications, where the message is sent in the form of a code. Cryptography has also found a niche in sports analytics. It is founded on ‘Number Theory.’

Number Theory is the study of relationships between numbers. It is essential in encoding and decoding cryptographic messages. These relationships can be expressed directly in the programming language Python.

Let us very briefly explore a few relationships between numbers:

  1. Greatest Common Divisor: The division algorithm states that for any two positive integers a and b, there exist unique integers q and r such that:

a = bq + r

Here, q is the quotient and r is the remainder, with 0 ≤ r < b.

If r ≠ 0, the greatest common divisor of a and b is the greatest common divisor of b and r. If r = 0, the greatest common divisor is b.

For example: gcd(12, 7) = 1. The first step is 12 = 7×1 + 5, so gcd(12, 7) = gcd(7, 5). Repeating the step until the remainder is 0 gives 1.
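The repeated division step can be written as a short loop. This is a minimal sketch in Python. The function name `gcd` is chosen here for illustration, and Python's standard library offers the same result through `math.gcd`.

```python
def gcd(a: int, b: int) -> int:
    """Euclidean algorithm: replace (a, b) with (b, a % b) until b is 0."""
    while b != 0:
        # a = b*q + r, so gcd(a, b) = gcd(b, r) with r = a % b
        a, b = b, a % b
    return a


print(gcd(12, 7))  # 1
```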

  2. Modulus (Mod) Function: For integers a, r, and m, we write:

a ≡ r (mod m)

This signifies that m divides a - r.

An example of this is 15 ≡ 3 (mod 6). Here, 6 divides 15 - 3 = 12.
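The definition translates directly into a one-line check. In this sketch the helper name `is_congruent` is hypothetical. It tests whether m divides a - r using Python's `%` operator.

```python
def is_congruent(a: int, r: int, m: int) -> bool:
    """Return True when a ≡ r (mod m), that is, when m divides a - r."""
    return (a - r) % m == 0


print(is_congruent(15, 3, 6))  # True, because 6 divides 12
print(is_congruent(15, 4, 6))  # False, because 6 does not divide 11
```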

  3. Fermat’s Little Theorem: This theorem states that for any prime number p and any integer a that is not a multiple of p:

a^(p-1) ≡ 1 (mod p)

An example of this is 7^2 ≡ 1 (mod 3), since 49 = 3×16 + 1.
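Python's built-in three-argument `pow(a, n, p)` computes a^n mod p without building the huge intermediate power, which makes the theorem easy to check numerically. The function name `fermat_holds` below is an illustrative choice.

```python
def fermat_holds(a: int, p: int) -> bool:
    """Check a^(p-1) ≡ 1 (mod p) for a prime p and a not divisible by p."""
    return pow(a, p - 1, p) == 1


print(fermat_holds(7, 3))  # True: 7^2 = 49 = 3*16 + 1

# Every a from 1 to 12 satisfies the theorem for the prime 13.
print(all(fermat_holds(a, 13) for a in range(1, 13)))  # True
```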

  4. Euler’s Phi Function: For any positive integer m, this function is defined as:

Φ(m) = #{a_i : 1 ≤ a_i < m and gcd(a_i, m) = 1}

That is, Φ(m) counts the numbers a_i that are less than m and co-prime to m.

For example: Φ(12) = 4, because the numbers 1, 5, 7, and 11 are co-prime to 12.

Similarly: Φ(p^k) = p^k - p^(k-1), where p is a prime number.

An extension of this can be expressed as: Φ(nm) = Φ(n)×Φ(m), which holds when gcd(n, m) = 1.
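The counting definition can be sketched directly in Python. The function name `phi` is chosen for illustration, and the brute-force count is only practical for small m. The sketch follows the definition above (1 ≤ a < m), so it is meant for m ≥ 2.

```python
from math import gcd


def phi(m: int) -> int:
    """Count the integers a with 1 <= a < m that are co-prime to m."""
    return sum(1 for a in range(1, m) if gcd(a, m) == 1)


print(phi(12))                     # 4  (the numbers 1, 5, 7, 11)
print(phi(27), 27 - 9)             # 18 18  (prime power: 3^3 - 3^2)
print(phi(35), phi(5) * phi(7))    # 24 24  (gcd(5, 7) = 1)
```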

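  5. Chinese Remainder Theorem: This theorem relates two numbers r and s, where gcd(r, s) = 1. For any integers b and c, there is exactly one x with 0 ≤ x < rs such that x ≡ b (mod r) and x ≡ c (mod s). It can be written as:

x = (c - b)αr + krs + b

Here, α is an integer with αr ≡ 1 (mod s), and k is the integer that brings x into the range 0 ≤ x < rs.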
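A small sketch of the construction, under the assumption that α denotes an inverse of r modulo s (the article does not define it), so that x = b + (c - b)·α·r reduced modulo rs. The function name `crt` is hypothetical, and `pow(r, -1, s)` requires Python 3.8 or later.

```python
def crt(b: int, r: int, c: int, s: int) -> int:
    """Solve x ≡ b (mod r) and x ≡ c (mod s) for co-prime r and s."""
    # alpha * r ≡ 1 (mod s); pow raises ValueError if gcd(r, s) != 1
    alpha = pow(r, -1, s)
    # Adding a multiple of r keeps x ≡ b (mod r); alpha makes x ≡ c (mod s)
    return (b + (c - b) * alpha * r) % (r * s)


x = crt(2, 3, 3, 5)
print(x, x % 3, x % 5)  # 8 2 3
```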

  6. Binary Expansion: Every number can be written in binary code, which uses only ‘0’s and ‘1’s.

For example: 0 is written as 00, 1 is written as 01, 2 is written as 10, 3 is written as 11, and so on.

The number 7 can be written as 111. Here, 111 is called the binary expansion of 7. An equation that can be used to find this is:

a_0 + 2a_1 + 2^2 a_2 + 2^3 a_3 + … + 2^n a_n = a

Here, a is any positive integer, with a binary expansion written as a_n … a_3 a_2 a_1 a_0, where each digit is 0 or 1.

In this equation:

  1. a_0 = a % 2. [This indicates that a_0 is given by the remainder when a is divided by 2.]

  2. a_1 = ⌊a/2⌋ % 2. [Here, ⌊a/2⌋ is a/2 rounded down to an integer.]
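Repeating the two steps above (take the remainder, then halve and round down) yields every binary digit in turn. This is a minimal sketch with an illustrative function name, `binary_digits`, for a positive integer a.

```python
def binary_digits(a: int) -> list:
    """Return the binary digits of a, least significant digit first."""
    digits = []
    while a > 0:
        digits.append(a % 2)  # current lowest digit
        a //= 2               # integer division drops that digit
    return digits


digits = binary_digits(6)
print(digits)  # [0, 1, 1]

# Written most significant digit first, this is the usual binary expansion.
print("".join(str(d) for d in reversed(digits)))  # 110
```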


  7. An Example of Cryptography

And here is how all that information on numbers is applied in cryptography:

p and q are two large primes that only the receiver knows, and Φ(m) = (p-1)(q-1).

m = pq, and k is any number such that gcd(k, Φ(m)) = 1. The values m and k are made public.

The sender sends a message s, encrypted as e = s^k mod m.

The receiver then solves x^k ≡ e (mod m). Knowing Φ(m), the receiver can find a number d such that kd ≡ 1 (mod Φ(m)), and compute x = e^d mod m.

Using this information, the receiver finds that x = s, and so recovers the value of s and decodes the message.

This is an example of the RSA cryptosystem, which is a form of asymmetric key encryption.
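The steps can be traced with deliberately tiny numbers. In this sketch the primes 61 and 53, the exponent k = 17, and the message s = 65 are illustrative values, and the decryption exponent d (the inverse of k modulo (p-1)(q-1)) is the standard way the receiver solves for x. Real keys use primes hundreds of digits long. `pow(k, -1, phi_m)` requires Python 3.8 or later.

```python
from math import gcd

p, q = 61, 53                # secret primes (far too small for real use)
m = p * q                    # public modulus: 3233
phi_m = (p - 1) * (q - 1)    # 3120, computable only by someone who knows p and q
k = 17                       # public exponent
assert gcd(k, phi_m) == 1

d = pow(k, -1, phi_m)        # private exponent: k*d ≡ 1 (mod phi_m)

s = 65                       # the message, as a number smaller than m
e = pow(s, k, m)             # sender encrypts: e = s^k mod m
x = pow(e, d, m)             # receiver decrypts: x = e^d mod m

print(d, e, x)  # 2753 2790 65
```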

Types of Encryption:

There are primarily two types of encryption: Symmetric Key and Asymmetric Key encryption.

  1. Symmetric Key Encryption involves the use of the same key for both encryption and decryption. Anyone who learns the key can decrypt the messages, so the key has to be shared securely between the sender and the receiver, which makes it vulnerable. One example of Symmetric key encryption is the Advanced Encryption Standard (AES).


  2. Asymmetric Key Encryption involves the use of two keys, a public key to encrypt the plain text and a private key to decrypt the coded text. Because the private key never has to be shared, asymmetric key encryption avoids the key-sharing weakness of symmetric key encryption, but it also takes more time and computing power, due to the magnitude of the numbers involved. One example of Asymmetric key encryption is the RSA cryptosystem.

The RSA Cryptosystem was invented by Ron Rivest, Adi Shamir, and Leonard Adleman in 1977. The name RSA comes from the first letters of the authors’ last names. The keys to encrypt and decrypt are generated from two large prime numbers. Though the formulae to generate the public and private keys are common knowledge, deriving the private key from the public key is difficult because of the time and computing power required to factorize the product of two large prime numbers.

The RSA cryptosystem was not the first of its kind to be developed. In 1973, an English mathematician named Clifford Cocks developed a precursor, to be used in British secret intelligence communications. This technology was classified until 1997, when it was revealed by Cocks in a lecture about the British secret communications agency’s history of encryption development.

In the RSA system, every participant uses two keys, public and private, to encrypt and decrypt plain text. As the names suggest, a private key is known only to the person, while the public key of a person is known to all participants. The person encrypting the text uses the recipient’s public key to encrypt the plain text. The recipient uses their private key to decrypt the encrypted text. As the recipient’s private key is known only to the recipient, no one other than the recipient can decrypt the message easily.

The RSA cryptosystem and its contemporaries also have a place in the burgeoning field of sports analytics. In the last few years, there has been a growing shift towards the analytics side of sports. Formerly unknown analytics maestros are now becoming more recognized, and their principles are being used to build teams that can contend for championships.

A core tenet of sports analytics is stripping away the bias that is inherently present in humans, allowing for an evaluation of players at face value.

One way to eliminate bias without the benefit of supercomputers and reams of data is through using cryptography to encrypt players’ names. This ensures that players are evaluated only based on their stats.

Let us say that we want to encrypt the name of the player, “Orange Seeds”, when their sports stats are being evaluated. Using the RSA module in a simple Python program, public and private keys are generated. The player’s name, “Orange Seeds”, is then encrypted. The public key used to encrypt the plain text is this:


The plain text, “Orange Seeds”, when encrypted with the public key, looks something like this:


We then use the private key to decrypt the bytes and get back the plain text. Because the encrypted text is made up of raw bytes that are awkward to print on a page or store as text, Base64 encoding is used to turn it into printable characters. The Base64 text is slightly longer than the bytes it encodes. It is convenient to handle, but it is also easy to reverse, and by itself it does not provide anything in the way of security.
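The encrypt-then-decrypt round trip for the player's name can be sketched with the standard library alone. The article's program uses an RSA module whose exact calls are not shown, so this sketch substitutes textbook RSA built on Python's `pow`. The primes (two well-known Mersenne primes), the exponent 65537, and the helper names `encrypt_name` and `decrypt_name` are all assumptions made for illustration. There is no padding and the primes are tiny by modern standards, so this is not secure. `pow(k, -1, phi_m)` requires Python 3.8 or later.

```python
# Illustrative key material only: real RSA keys use random primes of
# 1024 bits or more, together with a padding scheme.
p = 2**61 - 1
q = 2**89 - 1
m = p * q                    # public modulus
phi_m = (p - 1) * (q - 1)
k = 65537                    # public exponent
d = pow(k, -1, phi_m)        # private exponent


def encrypt_name(name: str) -> int:
    """Encrypt a short name with the public key (m, k)."""
    s = int.from_bytes(name.encode("utf-8"), "big")
    if s >= m:
        raise ValueError("name is too long for this modulus")
    return pow(s, k, m)


def decrypt_name(e: int) -> str:
    """Decrypt with the private exponent d and turn the number back into text."""
    x = pow(e, d, m)
    return x.to_bytes((x.bit_length() + 7) // 8, "big").decode("utf-8")


cipher = encrypt_name("Orange Seeds")
print(cipher)                # a large integer that hides the name
print(decrypt_name(cipher))  # Orange Seeds
```

An evaluator would see only the encrypted value next to the stats, and the name would be recovered with the private key once the evaluation is complete.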

For example, encoding the text “Orange Seeds” via Base64 results in the encoded text:

T3JhbmdlIFNlZWRz


Using a Base64 decoder, we can get the plain text from the encoded text.
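Base64 encoding and decoding are available in Python's standard `base64` module. This short sketch encodes the plain text itself, and the same two calls apply to encrypted bytes. The variable names are illustrative.

```python
import base64

plain = "Orange Seeds"

# Encode: text -> bytes -> Base64 bytes -> printable ASCII string
encoded = base64.b64encode(plain.encode("utf-8")).decode("ascii")

# Decode: anyone can reverse the encoding, since no key is involved
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)  # T3JhbmdlIFNlZWRz
print(decoded)  # Orange Seeds
```

The 12 characters of input become 16 characters of output, and no key is needed to reverse the step, which is why Base64 is a storage convenience and not a security measure.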

Thus, using cryptography, we can strip out the bias that is inherent in humans, allowing for a more objective evaluation of players’ talents, and maybe even learn a thing or two about coding, a very interesting subject in its own right!



References:

  1. Kerr, Matt. Number Theory and Cryptography. https://www.math.wustl.edu/~matkerr/NTCbook.pdf. Retrieved: 20 February 2021.

  2. Rivest, Ron L., Shamir, Adi, and Adleman, Leonard (1978). A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Retrieved from: https://people.csail.mit.edu/rivest/Rsapaper.pdf

  3. Cryptography 3.4.6 https://pypi.org/project/cryptography/

  4. Cryptosystems. https://www.tutorialspoint.com/cryptography/cryptosystems.htm. Retrieved: 20 February 2021