Assignment 2: a file encryptor
Aims
- to improve your understanding of filesystem objects
- to give you experience writing C code to manipulate binary files
- to further experience practical uses of bitwise operations
- to give you experience writing a relevant low-level data manipulation program in C
Introduction
Your task in this assignment is to write tide, a terribly insecure single-file encryption/decryption tool. Throughout this assignment, you will explore some basic filesystem operations, as well as implement several rudimentary encryption algorithms.
Encryption is the process of converting information into an obscured format which can, in theory, only be converted back into useful information by an authorised party who knows the encryption process and key. Encryption is an incredibly useful tool, and is the reason why the internet can function in the way it does, with sensitive information freely transmitted across it.
File encryption is particularly useful to safeguard data in the case that it is stolen. Encrypting your files could prevent someone from being able to access your photos in the event that your laptop gets stolen.
In this assignment, you will implement three different algorithms for file encryption: XOR (eXclusive OR), ECB (Electronic Code Book) and CBC (Cipher Block Chaining). Each of these algorithms functions slightly differently, but all work towards the same purpose: obscuring information so that it can only be correctly interpreted by an authorised party.
XOR encryption works by employing the bitwise XOR operation on every bit of some given data. A key is broken up into its constituent bits and expanded to match the length of the data being encrypted. The XOR operation is then employed between these two bitstreams to yield the encrypted data. This encrypted data can be decrypted only by re-running the same XOR operation with the same key. In tide, standalone XOR encryption will only employ the single-byte key 0xA9.
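The XOR description above can be illustrated with a minimal sketch. The function name xor_buffer and the constant name XOR_KEY are hypothetical and do not come from the provided code; only the key value 0xA9 is taken from this specification. The sketch works on a buffer already in memory and leaves all file handling aside.

```c
#include <stddef.h>
#include <stdint.h>

#define XOR_KEY 0xA9  // single-byte key stated in the specification

// XOR every byte of buf with the key, in place.
// Because (b ^ k) ^ k == b, calling this twice restores the original data.
// (Hypothetical helper: not part of the provided tide.c.)
void xor_buffer(uint8_t *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= XOR_KEY;
    }
}
```

Since the same call both encrypts and decrypts, a single command is enough to go in either direction.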
ECB encryption works by bit-shifting data by the amount specified by some key (a password). Each character in a 'block' of the input data is shifted by the value of the character in the corresponding position within the password. The encrypted data can be decrypted only by shifting it back by the value of the corresponding position within the password. In tide, passwords will be a fixed length of 16 characters.
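As a rough sketch of the per-character shifting described above, the following code makes two assumptions that this specification does not state: that the shift is circular (a rotation, so no bits are lost and the operation can be undone), and that the shift amount is the password character's value reduced modulo 8. All names here are hypothetical. The reference implementation remains the authority on the exact behaviour.

```c
#include <stdint.h>

#define BLOCK_SIZE 16  // one data byte per password character

// Rotate a byte left by n bits (assumption: n is taken modulo 8).
uint8_t rotate_left(uint8_t byte, unsigned n) {
    n %= 8;
    return (uint8_t)((byte << n) | (byte >> (8 - n)));
}

// Rotate a byte right by n bits; the inverse of rotate_left.
uint8_t rotate_right(uint8_t byte, unsigned n) {
    n %= 8;
    return (uint8_t)((byte >> n) | (byte << (8 - n)));
}

// Shift each byte of a block by the value of the matching password character.
void shift_block_encrypt(const uint8_t *in, uint8_t *out, const char *password) {
    for (int i = 0; i < BLOCK_SIZE; i++) {
        out[i] = rotate_left(in[i], (unsigned char)password[i]);
    }
}

// Undo shift_block_encrypt by shifting each byte back the same amount.
void shift_block_decrypt(const uint8_t *in, uint8_t *out, const char *password) {
    for (int i = 0; i < BLOCK_SIZE; i++) {
        out[i] = rotate_right(in[i], (unsigned char)password[i]);
    }
}
```

A plain (non-circular) shift would discard bits and could not be reversed, which is why the sketch assumes a rotation.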
CBC encryption is different from the above two algorithms as each block of the encrypted data contributes to the encryption of the next block. We will combine both XOR encryption and ECB encryption to develop an encryption algorithm where it is significantly harder for an unauthorised party to read our encrypted data by guessing our password.
However, before all of this, tide needs to be able to function as a basic standalone program. As such, we will implement several filesystem manipulation operations. You will also implement two different methods of searching for files, which will make the user's life easier in finding what they might need to encrypt.
Getting Started
Create a new directory for this assignment called tide, change to this directory, and fetch the provided code by running these commands:
mkdir tide
cd tide
1521 fetch tide
If you're not working at CSE, you can download the provided files as a zip file or a tar file.
This will get you tide.c, which contains code to start the assignment. As provided, it will compile and run, but lacks any real functionality:
make
dcc -Wall -Werror main.c tide.c -o tide
./tide
Welcome to tide!
To see what commands are available, type help.
tide> help
help (h) Prints this help message
pwd (p) Prints the current directory
chdir directory (cd) Changes the current directory
list (ls) Lists the contents of the current directory
test-encryptable filename (t) Tests if a file can be encrypted
xor-contents filename (x) Encrypts a file with simple XOR
encrypt-ecb filename (ee) Encrypts a file with ECB
decrypt-ecb filename (de) Decrypts a file with ECB
search-name search-term (sn) Searches for a file by filename
search-content search-size (sc) Searches for a file by its content for the provided bytes
search-from-file source-file (sf) Searches for a file by its content for the provided bytes, supplied from a file
encrypt-cbc filename (ec) Encrypts a file with CBC
decrypt-cbc filename (dc) Decrypts a file with CBC
quit (q) Quits the program
tide> q
Thanks for using tide. Have a nice day!
However, tide.c also contains some provided functions to make your task easier. For example, the sort_strings function will sort an array of strings into alphabetical order in-place. You should read through the provided code in this file before you begin work on this assignment.
You may also find the provided constants, data types and function signatures in tide.h to be useful.
Reference implementation
A reference implementation is a common, efficient, and effective method to provide or define an operational specification; and it's something you will likely work with after you leave UNSW.
We've provided a reference implementation, 1521 tide, which you can use to find the correct outputs and behaviours for any input:
1521 tide
Welcome to tide!
To see what commands are available, type help.
tide> help
help (h) Prints this help message
pwd (p) Prints the current directory
chdir directory (cd) Changes the current directory
list (ls) Lists the contents of the current directory
test-encryptable filename (t) Tests if a file can be encrypted
xor-contents filename (x) Encrypts a file with simple XOR
encrypt-ecb filename (ee) Encrypts a file with ECB
decrypt-ecb filename (de) Decrypts a file with ECB
search-name search-term (sn) Searches for a file by filename
search-content search-size (sc) Searches for a file by its content for the provided bytes
search-from-file source-file (sf) Searches for a file by its content for the provided bytes, supplied from a file
encrypt-cbc filename (ec) Encrypts a file with CBC
decrypt-cbc filename (dc) Decrypts a file with CBC
quit (q) Quits the program
tide> q
Thanks for using tide. Have a nice day!
tide also has a colored mode, to make it a bit nicer to use.
1521 tide --colors
Welcome to tide!
To see what commands are available, type help.
tide> q
Thanks for using tide. Have a nice day!
(As tends to be the convention in computing, we use the Americanised spelling of colour here.)
Every concrete example shown below is runnable using the reference implementation. Your goal is to make your program run just as the reference implementation does, with identical output and behaviour.
Where any aspect of this assignment is undefined in this specification, you should match the behaviour exhibited by the reference implementation. Discovering and matching the reference implementation's behaviour is deliberately a part of this assignment.
If you discover what you believe to be a bug in the reference implementation, please report it in the course forum. If it is a bug, we may fix the bug; or otherwise indicate that you do not need to match the reference implementation's behaviour in that specific case.
tide Examples
Additionally provided for your use is a command 1521 tide-examples.
When executed, it will create an examples directory in the current directory, and will create a number of files intended for testing in this directory. An example of its usage follows:
ls
commands.h escape.h main.c tide.c tide.h Makefile
1521 tide-examples
ls
commands.h escape.h main.c tide.c tide.h Makefile
ls examples
a empty.txt forbidden lorem lorem.txt ro_dir tide_sols.txt
The autotests make heavy use of these examples, so it is recommended to run this command before manually replicating any autotest output you intend to debug.
Your Tasks
This assignment consists of five subsets. Each subset builds on the work of the previous one, and each subset is more complex than the previous one.
It is recommended that you work on each subset in order.
Subset 0: File and directory commands
For this subset, you will need to implement the following three functions:
void print_current_directory()
void change_directory(char *directory)
void list_current_directory()
All user input is handled for you in main.c. You will never have to read input from stdin: all inputs are passed in through arguments to these functions for you.
Printing the current directory
You will need to implement print_current_directory, such that it prints the current directory the program is operating in.
Once you have this function working correctly, your tide implementation should match the following behaviour:
1521 tide-examples
cd examples
1521 tide
Welcome to tide!
To see what commands are available, type help.
tide> p
The current directory is: /home/z5555555/tide/examples
tide> q
Thanks for using tide. Have a nice day!
cd a
1521 tide
Welcome to tide!
To see what commands are available, type help.
tide> p
The current directory is: /home/z5555555/tide/examples/a
tide> q
Thanks for using tide. Have a nice day!
(Your home directory will, of course, feature your zID instead, and your overall path may be a little different!)
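One possible way to obtain the working directory is the POSIX getcwd function from unistd.h. The sketch below is illustrative only: the helper name format_current_directory and the buffer size MAX_PATH_LEN are assumptions, not part of the provided code. The message text follows the example session above.

```c
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_PATH_LEN 4096  // assumed upper bound on path length

// Write "The current directory is: <cwd>" into out.
// Returns 0 on success, -1 if the working directory could not be read.
// (Hypothetical helper: not part of the provided tide.c.)
int format_current_directory(char *out, size_t size) {
    char cwd[MAX_PATH_LEN];
    if (getcwd(cwd, sizeof cwd) == NULL) {
        return -1;
    }
    snprintf(out, size, "The current directory is: %s", cwd);
    return 0;
}
```

The formatted string could then be printed with a trailing newline; building it in a buffer first simply makes the result easy to check.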
Changing directories
You will need to implement change_directory so that it changes the current working directory of tide. All other functions should now operate on the new working directory.
Supplied directories can be both relative and absolute. Additionally, your implementation should expand ~ into the user's home directory, using the HOME environment variable.
Once you have this function working correctly, your tide implementation should match the following behaviour:
1521 tide-examples
cd examples
1521 tide
Welcome to tide!
To see what commands are available, type help.
tide> cd a
Moving to a
tide> p
The current directory is: /home/z5555555/tide/examples/a
tide> cd ..
Moving to ..
tide> p
The current directory is: /home/z5555555/tide/examples
tide> cd this_doesnt_exist
Could not change directory.
tide> q
Thanks for using tide. Have a nice day!
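The ~ expansion and directory change described above might be sketched as follows, using getenv and the POSIX chdir function. The helper names expand_home and change_to, and the buffer size MAX_PATH_LEN, are hypothetical. The sketch also assumes that only a leading ~ (alone or followed by /) is expanded, which this specification does not spell out; check the reference implementation for other cases.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define MAX_PATH_LEN 4096  // assumed upper bound on path length

// Copy directory into out, replacing a leading "~" with $HOME.
// Returns 0 on success, -1 if HOME is unset or the result does not fit.
int expand_home(const char *directory, char *out, size_t size) {
    int written;
    if (directory[0] == '~' && (directory[1] == '\0' || directory[1] == '/')) {
        const char *home = getenv("HOME");
        if (home == NULL) {
            return -1;
        }
        written = snprintf(out, size, "%s%s", home, directory + 1);
    } else {
        written = snprintf(out, size, "%s", directory);
    }
    return (written < 0 || (size_t)written >= size) ? -1 : 0;
}

// Change the working directory; returns 0 on success, -1 on failure.
int change_to(const char *directory) {
    char path[MAX_PATH_LEN];
    if (expand_home(directory, path, sizeof path) != 0) {
        return -1;
    }
    return chdir(path);
}
```

In the session above, a successful change prints "Moving to" followed by the directory as typed, and a failed one prints "Could not change directory."; a caller could choose between the two based on the return value.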
Listing the current directory
list_current_directory should print every file and folder in tide's working directory, along with its permissions. The output of this function should be sorted.
Fortunately, a sort function has been provided for you! Calling sort_strings sorts the supplied array of strings in-place. You can then print the now-sorted array in order, in the required format.
Once you have this function working correctly, your tide implementation should match the following behaviour:
1521 tide-examples
cd examples
1521 tide
Welcome to tide!
To see what commands are available, type help.
tide> ls
drwxr-xr-x .
drwxr-xr-x ..
drwxr-xr-x a
-rw-r--r-- empty.txt
drwxr-xr-x forbidden
drwxr-xr-x lorem
-rw-r--r-- lorem.txt
drwxr-xr-x ro_dir
-rw-r--r-- tide_sols.txt
tide> q
Thanks for using tide. Have a nice day!
(Your .. directory's permissions may look a little different!)
Testing this subset
You can test and submit this subset with:
1521 autotest tide subset0
give cs1521 ass2_tide tide.c [other .c or .h files]
Testing
You are expected to do your own testing. Some autotests are available to help you get started:
1521 autotest tide
You can create extra .c or .h files, but you will need to supply them explicitly to autotest, as follows:
1521 autotest tide [other .c or .h files]
Assumptions and Clarifications
Like all good programmers, you should make as few assumptions as possible.
Your submitted code must be written in C. You may not submit code in other languages.
You should avoid leaking memory wherever possible.
All supplied error messages should be printed to stdout.
You can call functions from the C standard library available by default on CSE Linux systems: including, e.g., stdio.h, stdlib.h, string.h, assert.h.
Additionally, you may call functions from the C POSIX libraries available on CSE Linux systems: including, e.g., unistd.h, sys/stat.h, dirent.h.
You may not create subprocesses: you may not use posix_spawn, posix_spawnp, system, popen, fork, vfork, clone, or any of the exec* family of functions, like execve.
You may not create or use temporary files.
tide only has to handle ordinary files and directories.
tide does not have to handle symbolic links, devices or other special files.
tide will not be given directories containing symbolic links, devices or other special files.
tide does not have to handle hard links.
If you need clarification on what you can and cannot use or do for this assignment, please ask in the course forum.
You are required to submit intermediate versions of your assignment. See below for details.
Your program must not require extra compile options. It must compile with dcc *.c -o tide, and it will be run with dcc when marking. Run-time errors from illegal C will cause your code to fail automarking.
If your program writes out debugging output, it will fail automarking tests. Make sure you disable debugging output before submission.