My challenge was to complete a shift cipher challenge that can encode and decode simple phrases. My code passes their test script, as well as some other important cases not included in their test script.
Shift ciphers are also called Caesar ciphers as they were used by Julius Caesar to scramble his messages to military personnel, so that they were purposefully unintelligible to any unintended reader.
Using a shift cipher is a simple and well-known (albeit usually insecure!) encryption technique, and a suitable topic for the focus of our challenge. In a simple shift cipher, a given input phrase is shifted by a certain number of characters so that it is no longer legible.
For example, when using a shift of 1, the letter a would be shifted to b.
When using a shift of 1 (default):
cipher = Cipher()
# Encode:
cipher.encode('foobar') # ---> 'gppcbs'
# Decode:
# We can simply reverse the shift
cipher.decode('gppcbs') # ---> 'foobar'
When using a shift of 10:
cipher = Cipher()
# Encode:
cipher.encode('foobar', 10) # ---> 'pyylkb'
# Decode:
cipher.decode('pyylkb', 10) # ---> 'foobar'
Our implementation of the shift cipher is a simple one:
- For encoding, the input text should be shifted by n characters, with a default shift of 1
- For decoding, encoded text should be shifted back and return the original input text.
- When a shift causes a letter to pass the end of the alphabet, it should wrap to the beginning.
Example: z shifted by 1 should become a.
The simple_cipher.py
script is what calculates the shift cipher.
At a high level, the script loops through the plaintext
, and based on the
character char
on that iteration, whether it is encoding
or decoding
, and
the shift_distance
, performs a calculation to do the shift, and then appends
the shifted character to a list. At the end of the loop, the join of the list
is returned.
-
Because it stores the characters in a list before returning the join, the space complexity of the algorithm is O(N)
-
Because it loops through each
char
inplaintext
, the time complexity of the algorithm is also O(N). The join() operation is also O(N), but since it is outside the loop, this makes the pipeline effectively still O(N).
The algorithm leverages the relationship between letter characters in unicode, and the circulatory of the shift rule through the use of modulo.
-
Lowercase letters (
a
throughz
) convert to 97 through 122 in unicode. For encoding, the algorithmchr((ord(char) + shift_distance - 97) % 26 + 97)
takes the unicode valueord()
of a given lowercase letter characterchar
, adds theshift_distance
, and subtracts 97, such that with ashift_distance
of 0, the minimum value of lowercase letters would be 0, corresponding toa
. The modulo 26%
gives us the division remainder, such that forchar = a
andshift_distance = 1
, we would have:97 + 1 - 97 = 1
1 % 26 = 1
1 + 97 = 98
Which is then converted back to characterchr()
aschr(98) = b
.
-
With a
shift_distance
of 0, this calculation is still performed, but this will only return the original character. -
Because of this use of modulo, if the
shift_distance
would exceed the alphabet (e.g.z
withshift_distance = 1
), it will instead give us a remainder value less than or equal to 26, which in this example would be122 + 1 - 97 = 26
26 % 26 = 0
0 + 97 = 97
chr(97) = a
-
For uppercase letters (
A
throughZ
), the values range from 65 through 90, and the value prior to modulo is normalized by subtracting 65. However, after the modulo, 97 is added, returning a lowercase letter. In this case, even with ashift_distance
of 0, a lowercase version of the letter will be returned. -
encode
anddecode
are effectively the inverse of each other. In encoding, theshift_distance
is added, and in decoding, theshift_distance
is subtracted, such that encoding with a negativeshift_distance
is the same as decoding with the absolute value of thatshift_distance
, and the reverse is also true.
In regards to running the code, the script can be read as follows:
Cipher
is a class with the methodsencode
anddecode
, making it compatible with the provided test scriptsimple_cipher_test.py
.
The methods take two inputs, plaintext
and shift_distance
.
-
The first input is
plaintext
, the string to be encoded/decoded. If no inputs are provided, an exception will be raised. Ifplaintext
is empty, contains any spaces, or any characters besides letters, this will also raise an exception. -
shift_distance
is the number of times to shift each character. It is an optional input, and defaults to 1. A shift_distance of 0 will run the calculation, but this will only change uppercase characters to lowercase, lowercase characters will not be changed.
Fork this repository to your account and clone it, or download it as a zip and extract the files.
- Open a terminal window and navigate to the challenge directory.
- Run
pip install pytest
. - Run
pytest simple_cipher_test.py
to test the challenge.
- Ensure you have python installed (
python -v
) - Ensure you have pytest installed (
pip install pytest
) - Ensure you are in the challenge directory
The Cyndx platform harnesses the power of semantic search which is driven by our proprietary predictive analytics engine. We make data smarter, so that it can work harder and more effective for you, which ultimately allow you to find the right investments or investors to satisfy your needs, every time.
Learn more: https://www.cyndx.com
Simplified and adapted from the Exercism (https://exercism.io/) cipher exercise.