
Monday, September 24, 2012

DATA WAREHOUSING AND MINING ENGINEERING LECTURE NOTES--Classical Encryption


CONVENTIONAL ENCRYPTION

         also referred to as conventional / private-key / single-key encryption

         sender and recipient share a common key

         all classical encryption algorithms are private-key

         was the only type of encryption prior to the invention of public-key cryptography in the 1970s

Some basic terminology:

         plaintext - the original message

         ciphertext - the coded message

         cipher - algorithm for transforming plaintext to ciphertext

         key - info used in cipher known only to sender/receiver

         encipher (encrypt) - converting plaintext to ciphertext

         decipher (decrypt) - recovering plaintext from ciphertext

         cryptography - study of encryption principles/methods

         cryptanalysis (codebreaking) - the study of principles/ methods of deciphering ciphertext without knowing key

         cryptology - the field of both cryptography and cryptanalysis


 

Here the original message, referred to as plaintext, is converted into apparently random nonsense, referred to as cipher text. The encryption process consists of an algorithm and a key. The key is a value independent of the plaintext. Changing the key changes the output of the algorithm. Once the cipher text is produced, it may be transmitted. Upon reception, the cipher text can be transformed back to the original plaintext by using a decryption algorithm and the same key that was used for encryption.

The security depends on several factors. First, the encryption algorithm must be powerful enough that it is impractical to decrypt a message on the basis of cipher text alone. Beyond that, the security depends on the secrecy of the key, not the secrecy of the algorithm.

         Two requirements for secure use of symmetric encryption:

        a strong encryption algorithm

        a secret key known only to sender / receiver

                                    Y = E_K(X)

                                    X = D_K(Y)

         assume encryption algorithm is known

         implies a secure channel to distribute key

Figure: conventional cryptosystem
 
 

 

 

 

 

 

 

 

 
A source produces a message in plaintext, X = [X1, X2, … , XM], where M is the number of letters in the message. A key of the form K = [K1, K2, …, KJ] is generated. If the key is generated at the source, then it must be provided to the destination by means of some secure channel.

With the message X and the encryption key K as input, the encryption algorithm forms the cipher text Y = [Y1, Y2, …, YN]. This can be expressed as

                                                Y = E_K(X)

The intended receiver, in possession of the key, is able to invert the transformation:

                                                X = D_K(Y)

An opponent, observing Y but not having access to K or X, may attempt to recover X or K or both. It is assumed that the opponent knows the encryption and decryption algorithms. If the opponent is interested in only this particular message, then the focus of effort is to recover X by generating a plaintext estimate. Often, however, the opponent is interested in being able to read future messages as well, in which case an attempt is made to recover K by generating an estimate of the key.

 

Cryptography

Cryptographic systems are generally classified along 3 independent dimensions:

  • Type of operations used for transforming plain text to cipher text

All encryption algorithms are based on two general principles: substitution, in which each element in the plaintext is mapped into another element, and transposition, in which elements in the plaintext are rearranged.

  • The number of keys used

If the sender and receiver use the same key, the system is referred to as symmetric-key, single-key, or conventional encryption.

If the sender and receiver use different keys, the system is referred to as asymmetric, two-key, or public-key encryption.

  • The way in which the plain text is processed

A block cipher processes the input one block of elements at a time, producing an output block for each input block.

A stream cipher processes the input elements continuously, producing output one element at a time, as it goes along.

Cryptanalysis

The process of attempting to discover X or K or both is known as cryptanalysis. The strategy used by the cryptanalyst depends on the nature of the encryption scheme and the information available to the cryptanalyst.

There are various types of cryptanalytic attacks based on the amount of information known to the cryptanalyst.

  • Cipher text only – A copy of cipher text alone is known to the cryptanalyst.
  • Known plaintext – The cryptanalyst has a copy of the cipher text and the corresponding plaintext.
  • Chosen plaintext – The cryptanalyst gains temporary access to the encryption machine. The cryptanalyst cannot open it to find the key, but can encrypt a large number of suitably chosen plaintexts and try to use the resulting cipher texts to deduce the key.
  • Chosen cipher text – The cryptanalyst obtains temporary access to the decryption machine, uses it to decrypt several strings of symbols, and tries to use the results to deduce the key.

 

STEGANOGRAPHY

A plaintext message may be hidden in one of two ways. The methods of steganography conceal the existence of the message, whereas the methods of cryptography render the message unintelligible to outsiders by various transformations of the text.

A simple form of steganography, but one that is time consuming to construct, is one in which an arrangement of words or letters within an apparently innocuous text spells out the real message.

e.g., (i) the sequence of first letters of each word of the overall message spells out the real (hidden) message.

(ii) Subset of the words of the overall message is used to convey the hidden message.
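Case (i) can be sketched in a few lines of Python. This is only an illustration: the function name first_letters and the cover sentence are made up for this example and are not part of the original notes.

```python
def first_letters(cover_text: str) -> str:
    """Recover a hidden message from the first letter of each word."""
    return "".join(word[0] for word in cover_text.split())


# Hypothetical, innocuous-looking cover text whose word initials spell "help".
cover = "Have every letter posted"
print(first_letters(cover).lower())
```

Constructing a natural-sounding cover text is the hard part; reading the hidden message back is trivial once the scheme is known.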

Various other techniques have been used historically, including:

  • Character marking – selected letters of printed or typewritten text are overwritten in pencil. The marks are ordinarily not visible unless the paper is held at an angle to bright light.
  • Invisible ink – a number of substances can be used for writing but leave no visible trace until heat or some chemical is applied to the paper.
  • Pin punctures – small pin punctures on selected letters are ordinarily not visible unless the paper is held in front of the light.
  • Typewriter correction ribbon – used between the lines typed with a black ribbon, the results of typing with the correction tape are visible only under a strong light.

Drawbacks of steganography

  • Requires a lot of overhead to hide relatively few bits of information.
  • Once the system is discovered, it becomes virtually worthless.

 

CLASSICAL ENCRYPTION TECHNIQUES

There are two basic building blocks of all encryption techniques: substitution and transposition.

I. SUBSTITUTION TECHNIQUES

A substitution technique is one in which the letters of plaintext are replaced by other letters or by numbers or symbols. If the plaintext is viewed as a sequence of bits, then substitution involves replacing plaintext bit patterns with cipher text bit patterns.

 

(i) Caesar cipher (or) shift cipher

The earliest known, and the simplest, use of a substitution cipher was by Julius Caesar. The Caesar cipher involves replacing each letter of the alphabet with the letter standing 3 places further down the alphabet.

 

e.g., plain text : pay more money

      Cipher text: SDB PRUH PRQHB

Note that the alphabet is wrapped around, so that the letter following ‘z’ is ‘a’.

If each letter is assigned a numerical value (a = 0, b = 1, ..., z = 25), then for each plaintext letter p, the cipher text letter C is given by

                        C = E(p) = (p + 3) mod 26

The shift may be of any amount, so the general Caesar algorithm is

                        C = E(p) = (p + k) mod 26

where k takes on a value in the range 1 to 25. The decryption algorithm is simply

                        p = D(C) = (C - k) mod 26
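The two formulas above translate almost directly into code. The following is a minimal sketch, assuming letters are numbered a = 0 through z = 25 and that non-letters (such as spaces) pass through unchanged; the function names are chosen here for illustration.

```python
import string

ALPHABET = string.ascii_lowercase  # a = 0, b = 1, ..., z = 25


def caesar_encrypt(plaintext: str, k: int = 3) -> str:
    """C = (p + k) mod 26, applied letter by letter; output in upper case."""
    out = []
    for ch in plaintext.lower():
        if ch in ALPHABET:
            p = ALPHABET.index(ch)
            out.append(ALPHABET[(p + k) % 26].upper())
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)


def caesar_decrypt(ciphertext: str, k: int = 3) -> str:
    """p = (C - k) mod 26, applied letter by letter; output in lower case."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            c = ALPHABET.index(ch)
            out.append(ALPHABET[(c - k) % 26])
        else:
            out.append(ch)
    return "".join(out)


print(caesar_encrypt("pay more money"))   # SDB PRUH PRQHB
print(caesar_decrypt("SDB PRUH PRQHB"))   # pay more money
```

With only 25 possible values of k, every shift can simply be tried in turn, which is why the Caesar cipher offers no real security.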

 

(ii) Playfair cipher

The best known multiple-letter encryption cipher is the Playfair, which treats digrams in the plaintext as single units and translates these units into cipher text digrams.

The Playfair algorithm is based on the use of a 5x5 matrix of letters constructed using a keyword. Let the keyword be ‘monarchy’. The matrix is constructed by filling in the letters of the keyword (minus duplicates) from left to right and from top to bottom, and then filling in the remainder of the matrix with the remaining letters in alphabetical order. The letters ‘i’ and ‘j’ count as one letter. Plaintext is encrypted two letters at a time according to the following rules:

  • Repeating plaintext letters that would fall in the same pair are separated with a filler letter such as ‘x’.
  • Plaintext letters that fall in the same row of the matrix are each replaced by the letter to the right, with the first element of the row following the last.
  • Plaintext letters that fall in the same column are replaced by the letter beneath, with the top element of the column following the last.
  • Otherwise, each plaintext letter is replaced by the letter that lies in its own row and the column occupied by the other plaintext letter.

 

        M    O    N    A    R
        C    H    Y    B    D
        E    F    G    I/J  K
        L    P    Q    S    T
        U    V    W    X    Z

 

Plaintext = meet me at the school house

 

Splitting two letters as a unit => me  et   me  at   th  es   ch   ox  ol   ho  us  ex

Corresponding cipher text     => CL KL CL RS PD IL HY AV MP HF XL IU

 

Strength of Playfair cipher

  • Playfair cipher is a great advance over simple monoalphabetic ciphers.
  • Since there are 26 letters, 26x26 = 676 digrams are possible, so identification of individual digrams is more difficult.
  • Frequency analysis is much more difficult.

 

 

 

(iii) Polyalphabetic ciphers

Another way to improve on the simple monoalphabetic technique is to use different monoalphabetic substitutions as one proceeds through the plaintext message. The general name for this approach is polyalphabetic cipher. All the techniques have the following features in common.

  • A set of related monoalphabetic substitution rules is used.
  • A key determines which particular rule is chosen for a given transformation.

 

(iv) Vigenere cipher

In this scheme, the set of related monoalphabetic substitution rules consists of the 26 Caesar ciphers with shifts of 0 through 25. Each cipher is denoted by a key letter, e.g., the Caesar cipher with a shift of 3 is denoted by the key value ‘d’ (since a=0, b=1, c=2 and so on). To aid in understanding the scheme, a matrix known as the Vigenere tableau is constructed.

 
                                    PLAIN TEXT

                     a  b  c  d  e  f  g  h  i  j  k  ...  x  y  z

KEY LETTERS     a    A  B  C  D  E  F  G  H  I  J  K  ...  X  Y  Z
                b    B  C  D  E  F  G  H  I  J  K  L  ...  Y  Z  A
                c    C  D  E  F  G  H  I  J  K  L  M  ...  Z  A  B
                d    D  E  F  G  H  I  J  K  L  M  N  ...  A  B  C
                e    E  F  G  H  I  J  K  L  M  N  O  ...  B  C  D
                f    F  G  H  I  J  K  L  M  N  O  P  ...  C  D  E
                g    G  H  I  J  K  L  M  N  O  P  Q  ...  D  E  F
                :    :  :  :  :  :  :  :  :  :  :  :       :  :  :
                x    X  Y  Z  A  B  C  D  E  F  G  H  ...  U  V  W
                y    Y  Z  A  B  C  D  E  F  G  H  I  ...  V  W  X
                z    Z  A  B  C  D  E  F  G  H  I  J  ...  W  X  Y

 

Each of the 26 ciphers is laid out horizontally, with the key letter for each cipher to its left. A normal alphabet for the plaintext runs across the top. The process of encryption is simple: given a key letter x and a plaintext letter y, the cipher text letter is at the intersection of the row labeled x and the column labeled y; in this case, the cipher text is V.

To encrypt a message, a key is needed that is as long as the message. Usually, the key is a repeating keyword.

e.g.,     key = deceptivedeceptivedeceptive

          PT  = wearediscoveredsaveyourself

          CT  = ZICVTWQNGRZGVTWAVZHCQYGLMGJ

Decryption is equally simple. The key letter again identifies the row. The position of the cipher text letter in that row determines the column, and the plaintext letter is at the top of that column.
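Looking up a row and column in the tableau is the same as adding the key letter to the plaintext letter modulo 26 (with a = 0), and decryption subtracts it again. The sketch below assumes a non-empty keyword made only of letters and drops non-letters from the message; the function names are illustrative.

```python
from itertools import cycle

A = ord("a")


def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Row = key letter, column = plaintext letter: C = (p + k) mod 26."""
    letters = [ch for ch in plaintext.lower() if ch.isalpha()]
    out = []
    # cycle() repeats the keyword until the message is exhausted
    for p, k in zip(letters, cycle(keyword.lower())):
        out.append(chr((ord(p) - A + ord(k) - A) % 26 + A).upper())
    return "".join(out)


def vigenere_decrypt(ciphertext: str, keyword: str) -> str:
    """Find the ciphertext letter in the key row: p = (C - k) mod 26."""
    letters = [ch for ch in ciphertext.lower() if ch.isalpha()]
    out = []
    for c, k in zip(letters, cycle(keyword.lower())):
        out.append(chr((ord(c) - ord(k)) % 26 + A))
    return "".join(out)


ct = vigenere_encrypt("we are discovered save yourself", "deceptive")
print(ct)                                  # ZICVTWQNGRZGVTWAVZHCQYGLMGJ
print(vigenere_decrypt(ct, "deceptive"))   # wearediscoveredsaveyourself
```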

 

Strength of Vigenere cipher

  • There are multiple ciphertext letters for each plaintext letter.
  • Letter frequency information is obscured.

 

One Time Pad Cipher

The one-time pad is an unbreakable cryptosystem. It represents the message as a sequence of 0s and 1s; this can be accomplished by writing all numbers in binary, for example, or by using ASCII. The key is a random sequence of 0s and 1s of the same length as the message. Once a key is used, it is discarded and never used again. The system can be expressed as follows:

Ci = Pi ⊕ Ki

Ci - ith binary digit of cipher text

Pi - ith binary digit of plaintext

Ki - ith binary digit of key

⊕ - exclusive OR operation

Thus the cipher text is generated by performing the bitwise XOR of the plaintext and the key. Decryption uses the same key. Because of the properties of XOR, decryption simply involves the same bitwise operation:

Pi = Ci ⊕ Ki

e.g.,     plaintext  = 0 0 1 0 1 0 0 1

            Key         = 1 0 1 0 1 1 0 0

                             -------------------

            ciphertext = 1 0 0 0 0 1 0 1
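The same example can be reproduced in Python, where ^ is the XOR operator. This is a minimal sketch: the helper names xor_bits and random_key are introduced here, and the use of the standard secrets module to draw key bits is an assumption about how one might generate a key, not something the notes specify.

```python
import secrets


def xor_bits(a: list[int], b: list[int]) -> list[int]:
    """Bitwise XOR of two equal-length bit sequences."""
    if len(a) != len(b):
        raise ValueError("key must be the same length as the message")
    return [x ^ y for x, y in zip(a, b)]


def random_key(n: int) -> list[int]:
    """Draw n key bits from the operating system's random source."""
    return [secrets.randbits(1) for _ in range(n)]


plaintext = [0, 0, 1, 0, 1, 0, 0, 1]
key = [1, 0, 1, 0, 1, 1, 0, 0]

ciphertext = xor_bits(plaintext, key)   # encryption: Ci = Pi XOR Ki
recovered = xor_bits(ciphertext, key)   # decryption: Pi = Ci XOR Ki

print(ciphertext)   # [1, 0, 0, 0, 0, 1, 0, 1]
print(recovered)    # [0, 0, 1, 0, 1, 0, 0, 1]
```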

 

Advantage:

  • The encryption method is completely unbreakable under a ciphertext-only attack.

Disadvantages

  • It requires a very long key which is expensive to produce and expensive to transmit.
  • Once a key is used, it is dangerous to reuse it for a second message; any knowledge of the first message would give knowledge of the second.
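The danger of key reuse follows from the XOR algebra: if C1 = P1 ⊕ K and C2 = P2 ⊕ K, then C1 ⊕ C2 = P1 ⊕ P2, so the key cancels out. The sketch below illustrates this on bytes rather than single bits; the two messages and the key bytes are arbitrary values chosen for the example.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


key = bytes([0x5A, 0xC3, 0x19, 0x7E, 0x88, 0x21])  # arbitrary example key
p1 = b"attack"
p2 = b"defend"

c1 = xor_bytes(p1, key)   # both messages encrypted with the SAME key
c2 = xor_bytes(p2, key)

# The key drops out: C1 xor C2 equals P1 xor P2.
combined = xor_bytes(c1, c2)
print(combined == xor_bytes(p1, p2))   # True

# An opponent who learns the first plaintext recovers the second
# without ever seeing the key.
recovered_p2 = xor_bytes(combined, p1)
print(recovered_p2)                    # b'defend'
```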

 

 

 


 

II. TRANSPOSITION TECHNIQUES

All the techniques examined so far involve the substitution of a cipher text symbol for a plaintext symbol. A very different kind of mapping is achieved by performing some sort of permutation on the plaintext letters. This technique is referred to as a transposition cipher.

The rail fence is the simplest such cipher, in which the plaintext is written down as a sequence of diagonals and then read off as a sequence of rows.

Plaintext          = meet at the school house

To encipher this message with a rail fence of depth 2, we write the message as follows:

m   e   a   t   e   c   o   l   o   s

   e   t    t    h   s   h   o   h   u   e

The encrypted message is

MEATECOLOSETTHSHOHUE
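A rail fence can be sketched as follows. The notes only work through depth 2; the zigzag handling of larger depths here is the usual generalisation and should be read as an assumption, as should the function name.

```python
def rail_fence_encrypt(plaintext: str, depth: int = 2) -> str:
    """Write letters along a zigzag over `depth` rails, then read rail by rail."""
    letters = [ch for ch in plaintext.lower() if ch.isalpha()]
    rails = [[] for _ in range(depth)]
    row, step = 0, 1
    for ch in letters:
        rails[row].append(ch)
        if depth > 1:
            if row == 0:
                step = 1            # bounce off the top rail
            elif row == depth - 1:
                step = -1           # bounce off the bottom rail
            row += step
    return "".join("".join(rail) for rail in rails).upper()


print(rail_fence_encrypt("meet at the school house", depth=2))
# MEATECOLOSETTHSHOHUE
```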

Row transposition ciphers - A more complex scheme is to write the message in a rectangle, row by row, and read the message off, column by column, but permute the order of the columns. The order of the columns then becomes the key of the algorithm.

e.g.,                 plaintext = meet at the school house

                        Key = 4       3        1      2     5       6      7

                        PT   = m      e        e       t       a      t       t

                                    h       e        s       c      h      o      o

                                     l       h        o       u      s      e

                        CT   = ESOTCUEEHMHLAHSTOETO
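The example above can be reproduced with a short sketch. It assumes the key is a list of distinct column labels, one per column, and that the last row may be left short rather than padded, as in the example; the function name is illustrative.

```python
def row_transposition_encrypt(plaintext: str, key: list[int]) -> str:
    """Write row by row under the key, read columns in key order 1, 2, 3, ..."""
    letters = [ch for ch in plaintext.lower() if ch.isalpha()]
    ncols = len(key)
    # Fill the rectangle row by row; the final row may be incomplete.
    rows = [letters[i:i + ncols] for i in range(0, len(letters), ncols)]
    out = []
    for label in sorted(key):
        col = key.index(label)      # column that carries this key label
        for row in rows:
            if col < len(row):      # skip missing cells in a short last row
                out.append(row[col])
    return "".join(out).upper()


print(row_transposition_encrypt("meet at the school house", [4, 3, 1, 2, 5, 6, 7]))
# ESOTCUEEHMHLAHSTOETO
```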

A pure transposition cipher is easily recognized because it has the same letter frequencies as the original plaintext. The transposition cipher can be made significantly more secure by performing more than one stage of transposition. The result is a more complex permutation that is not easily reconstructed.


 
