Introduction to Cryptography 1: One Time Pad [WIP]

Introduction

Cryptography is the practice of securely communicating messages between two parties in presence of a third party, an adversary. An adversary should learn nothing about the plaintext nor should they be able to modify it. This article introduces Symmetric Encryption. A simple scheme where plaintext messages are encrypted using a single key. Symmetric encryption might not necessarily be the most secure scheme but a good starting point to understand cryptography. Let us assume that the two parties before agree to a secret key shared between them. There is also a setting where two parties do not agree upon a key beforehand called Asymmetric Cryptography. I will get into in my future blogposts. Asymmetric cryptography allows two parties to send messages without having to agree upon a common key beforehand.

Any crypto-graphical algorithm has three components 1) Key/Key Generation 2) Encryption Algorithm 3) Decryption Algorithm.

c = E(k,m)
m = D(k, c)
D(k, E(k, m) ) = m

A cryptographical algorithm is said to be correct if a given CT produced by a key and message can be decrypted by using the same key.

Let us consider the two most common ways people know to do cryptography 1) Substitution Table and 2) Caesar Cipher.

Ceaser Cipher

Every letter is shifted by some offset. For an offset of k=2. A becomes C , D becomes F and so on. While it seems secure since the plaintext is know obfuscated in some form. There are two major problems with it. Actually several more before it becomes a useable cipher, but let me list two major problems: The Cipher-text reveals a lot about plaintext Frequency Analysis attack Similar to Ceaser Cipher is substitution table where the every row of table is replaced with some other character. It also faces the same attacks as Caesar cipher. That is you can look at when characters repeat

One Time Pad


CT=E(Key,msg)=msg^Key
D(Key,CT)=msg=CT^Key

One time pad is a basic cryptographic technique where a message and key are xored to create a cipher text. But, OTP allows us to encrypt a given message only once. That is the same key cannot be used to encrypt many messages. Suppose two messages are encrypted with same key, the


CT1=msg1^Key
CT2=msg2^Key
CT1^CT2=((msg1)(Key))^((Key)(msg2))=(msg1)(msg2)


Maybe we could use this scheme if we have a method to generate several keys to encrypt large messages. A good scheme to generate several keys could be attained by a Random Number Generator (RNG). OTP is called a stream cipher since every

What is a Secure Cipher?

Semantic security, cipher text should not be able to give any information about plaintext. Given cipher text c1 and c2, an adversary shouldn’t be able to distinguish if c1 or c2 is the encryption of message m1.
Formally speaking

Pr[A(c1==m1)]=Pr[A(c2==m1)]

Pseudo Random Number Generator (PRNG)

A good random number generator is a generator which generates numbers that look as random as possible. That is an adversary shouldn’t be able to predict any sequence in any key. There are quantitative measures of defining a good PRNG called advantage function. An Advantage function determines whether a given a function mets some criteria to pass a truly random function. A simple example of advantage function, a PRNG having more than 4 consecutive 0’s or 1’s is not truly random or the number of consecutive 0’s and 1’s should be less than 60% for a string to be truly random. A single advantage function is not used to quantify PRNG security but a variety of advantage functions. You can learn more about these tests here.

A good metric to rate a PRNG is:

|Pr[Adv(PRNG Str)==1]-Pr[Adv(Truly RAND Str)==1]|


Informally, The probability of advantage function passing a truly random function should be close to the probability of advantage function passing a PRNG. Above criteria must be close to 0.

OTP for Many time pad

Now that we have an established way of generating new keys using PRNG’s. We can split message block into fixed sizes, generate a key using PRNG for each block and encrypt it. But, this would also result into a key as long as the message length. A good Cipher depends on Good Random Number Generator Since the above scheme depends on

2021

Introduction to Weakly Supervised Learning

3 minute read

Supervised Machine Learning relies on labelled data that consists of data and pairs of expected outputs. For example an image of dog that is labelled a dog. ...

Meta Learning with MAML

3 minute read

Training neural networks for a single task requires several thousands of examples for a each class when training a model from scratch. This is typically not ...

Analyze Private datasets using Pandas

6 minute read

Conventionally pandas allows you to analyze datasets that are present locally on your PC, that is when you are given access to a given dataset. But, there a...

Back to top ↑

2020

Deep Learning in Practice-Be The algorithm

6 minute read

Conventional machine learning required the practitioner to manually look at images/text and handcraft appropriate features. Deep Learning models are powerful...

Back to top ↑

2019

Differential Privacy Part-II: DP Mechanisms

6 minute read

Having gone through the importance of differential privacy and its definition, this article motivates the theory with a practical example to make it more int...

Differential Privacy Part-I: Introduction

6 minute read

Personal data is a personal valuable asset, it could be used for economic, social or even malicious benifits. Most internet companies survive on personal dat...

Back to top ↑