Information Security, Web, Networks and Systems

Tuesday, July 8, 2014

Encrypted File Transfer utility in Python

6:45 AM Posted by Deepal
In this post I describe the implementation of a secure file transfer utility for large files that preserves confidentiality and integrity using encryption and hashing. I cover the design and implementation of the utility and the source code.
Here is the link for the complete Python script on GitHub.

Design of the Utility

The requirement of this utility is to transfer “large files” between hosts securely. To provide confidentiality and integrity, we need a strong cryptographic mechanism to protect the data we send. Public key encryption is the usual way to avoid the key distribution problem, but the tradeoff is performance: encrypting and decrypting large files with a public key cipher is slow. So we clearly need a symmetric cipher to encrypt these large files before transferring them. The main issue with a symmetric cipher is protecting the secrecy of the key used for encryption. To solve both the performance problem and the key distribution problem, I have used AES (Advanced Encryption Standard) with a 256-bit key to encrypt the file and RSA public key encryption to encrypt the symmetric AES key.

Client functionality

This file transfer utility has two modes: client and server. In client mode, the utility behaves as an FTP client that encrypts the file to be transferred and then sends it. The client follows this process to encrypt and transfer the file.

Calculate the MD5 hash of the file to be sent. This is transferred to the server, and the server uses it to validate the integrity of the file.
Generate a 32-byte key using a random number generator; this is used as the AES key.
Encrypt the file with AES-256 using the random key generated above.
Encrypt the 32-byte AES key using the server’s public key.
Prepend the calculated MD5 hash and the encrypted AES key to the encrypted file. The file to be transferred then has the structure: MD5 hash, encrypted AES key, encrypted file data.
Append the IV (the initialization vector used by the AES CBC mode of operation) to the file (described later in this post).

The file created above is then transferred to the server.
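
The client-side steps that need only the standard library can be sketched as follows. This is a minimal sketch under stated assumptions, not the script linked at the top of this post: the names md5_hex_of_file, new_aes_key and pack_container are hypothetical, the AES and RSA steps are left out (they need PyCrypto), so the encrypted key and the ciphertext are passed in as opaque bytes, and the 32, 512 and 16 byte field sizes are the ones quoted elsewhere in this post.

```python
import hashlib
import os

HASH_LEN = 32      # length of an MD5 digest in hexadecimal form
ENC_KEY_LEN = 512  # size of the RSA-encrypted AES key quoted in this post
IV_LEN = 16        # AES block size, which is also the CBC IV size


def md5_hex_of_file(path, chunk_size=64 * 1024):
    """Hash a file chunk by chunk so a large file never sits fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().encode("ascii")


def new_aes_key():
    """Return 32 random bytes from the operating system, for use as an AES-256 key."""
    return os.urandom(32)


def pack_container(md5_hex, encrypted_key, ciphertext, iv):
    """Lay the fields out as: MD5 hash | encrypted AES key | encrypted data | IV."""
    if (len(md5_hex) != HASH_LEN or len(encrypted_key) != ENC_KEY_LEN
            or len(iv) != IV_LEN):
        raise ValueError("unexpected field length")
    return md5_hex + encrypted_key + ciphertext + iv
```

For a really large file the ciphertext would be written to disk chunk by chunk rather than joined in memory as pack_container does; the in-memory version is only meant to make the layout easy to see.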

Server Functionality

The server runs as an FTP server listening on the default FTP port (21). Once a file is received, the server triggers the decrypt() function to decrypt it. The server follows this decryption and integrity validation process.

The server extracts the first 32 bytes as the original file hash (an MD5 digest is 32 characters in hexadecimal form), the next 512 bytes as the encrypted AES symmetric key, and the rest of the file as the encrypted file data.
The server uses its private key to decrypt the encrypted AES symmetric key.
The server then uses the decrypted AES symmetric key to decrypt the encrypted file.
Finally, the MD5 digest of the decrypted file is calculated and compared with the MD5 hash value sent along with the encrypted file. If the two hashes are equal, integrity validation passes. Otherwise, the server treats the integrity check as failed and deletes the file.
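
The first step above amounts to slicing the received file at fixed offsets. Here is a minimal sketch, assuming the layout described in this post (32-byte hash, 512-byte encrypted key, then the encrypted data) and additionally assuming that the IV occupies the last 16 bytes, since the client appends it. The name split_container is hypothetical and is not taken from the actual script.

```python
HASH_LEN = 32      # MD5 digest in hexadecimal form
ENC_KEY_LEN = 512  # RSA-encrypted AES key
IV_LEN = 16        # assumed: IV stored as the trailing 16 bytes


def split_container(blob):
    """Return (md5_hex, encrypted_key, ciphertext, iv) from a received file."""
    header_len = HASH_LEN + ENC_KEY_LEN
    if len(blob) < header_len + IV_LEN:
        raise ValueError("file too short to contain hash, key and IV")
    md5_hex = blob[:HASH_LEN]
    encrypted_key = blob[HASH_LEN:header_len]
    ciphertext = blob[header_len:len(blob) - IV_LEN]
    iv = blob[len(blob) - IV_LEN:]
    if len(ciphertext) % 16 != 0:
        raise ValueError("ciphertext is not a whole number of AES blocks")
    return md5_hex, encrypted_key, ciphertext, iv
```

Checking the lengths before decrypting lets the server reject a truncated upload early instead of failing somewhere inside the RSA or AES step.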

The server is designed to accept anonymous logins, and anonymous users are given enough permissions to transfer files to the server using raw FTP operations.

Programming background

The utility is developed entirely in Python. pyftpdlib and ftplib provide the FTP server and FTP client functionality, and the PyCrypto library provides the cryptographic functionality, including AES and RSA encryption.
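
On the client side, sending the finished file with ftplib could look roughly like this. It is a sketch only: upload_file and its parameters are hypothetical names, and it assumes the server accepts anonymous logins as described above.

```python
from ftplib import FTP


def upload_file(host, local_path, remote_name, port=21):
    """Upload one file in binary mode over an anonymous FTP session."""
    with FTP() as ftp:
        ftp.connect(host, port)
        ftp.login()  # called without arguments, this logs in as "anonymous"
        with open(local_path, "rb") as f:
            # STOR streams the file to the server in blocks
            ftp.storbinary("STOR " + remote_name, f)
```

Because storbinary reads from the open file object in blocks, the client does not need to hold a large file in memory while sending it.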

Cryptography

Encryption

AES is used with a 32-byte (256-bit) symmetric key. The file to be sent is read in fixed-size chunks whose size is a multiple of 16, since AES operates on 16-byte blocks. Any chunk whose length is not a multiple of 16 bytes is padded with spaces. CBC (Cipher Block Chaining) is used as the AES mode of operation. CBC uses a 16-byte IV (Initialization Vector), which is generated randomly using Python’s random module. The IV is also appended to the encrypted file so that it can be used for decryption.
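
The padding and IV handling described here can be sketched with the standard library alone. The function names are hypothetical, and one thing is deliberately swapped: the sketch draws the IV from os.urandom instead of the random module, because the random module is not designed for cryptographic use.

```python
import os

BLOCK = 16  # AES block size in bytes


def pad_with_spaces(chunk):
    """Pad a chunk with spaces up to the next multiple of 16 bytes."""
    return chunk + b" " * (-len(chunk) % BLOCK)


def strip_to_size(padded, original_size):
    """Undo the padding by truncating to the known original length."""
    return padded[:original_size]


def new_iv():
    """Return a random 16-byte IV.

    Assumption: os.urandom is used here in place of the random module.
    """
    return os.urandom(BLOCK)
```

Truncating to a known original size, rather than stripping trailing spaces, matters because a file may legitimately end in spaces. This post does not say how the server learns the original size, so the sketch simply takes it as a parameter.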

RSA public key encryption is used to encrypt the AES symmetric key. The encrypted symmetric key is also prepended to the encrypted file so that the other end can decrypt it. I have used RSAES-OAEP (Optimal Asymmetric Encryption Padding) to create an encryption cipher for RSA using the RSA public key.
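
The 512-byte encrypted key mentioned earlier is consistent with a 4096-bit RSA key, since an RSA ciphertext is as long as the modulus. The short calculation below checks that a 32-byte AES key fits within the RSAES-OAEP message limit for that key size. Both inputs are assumptions: the 4096-bit modulus is inferred from the 512-byte figure, and the default 20-byte hash length assumes OAEP with SHA-1. Neither is stated in this post, and the function names are hypothetical.

```python
def rsa_ciphertext_len(modulus_bits):
    """An RSA ciphertext has the same length as the modulus, in bytes."""
    return (modulus_bits + 7) // 8


def oaep_max_message_len(modulus_bits, hash_len=20):
    """Largest plaintext RSAES-OAEP can carry: k - 2*hLen - 2 bytes."""
    k = rsa_ciphertext_len(modulus_bits)
    return k - 2 * hash_len - 2


# Assumed 4096-bit key with SHA-1 (20-byte digest)
print(rsa_ciphertext_len(4096))    # 512
print(oaep_max_message_len(4096))  # 470
```

A 32-byte AES key is well under the 470-byte limit, which is why the symmetric key can be encrypted in a single RSA operation.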

Decryption

Once the server receives the file from the client, it extracts the different portions of the file: the MD5 hash of the original file, the encrypted AES symmetric key and the actual encrypted file data. The server then uses RSA decryption with OAEP to recover the symmetric key. Using the recovered AES key, the server decrypts the encrypted file. However, the decrypted file is a bit different from what the client originally had, because it was padded with spaces for AES encryption. So the server truncates the decrypted file to its original size.

Validating integrity

MD5 hashing is used to validate the integrity of the received file. The server calculates the MD5 hash of the decrypted and truncated file and compares it with the MD5 hash the client sent with the file. Python’s hashlib library is used to calculate the MD5 hashes. If the two MD5 hashes are equal, the integrity check passes. The FTP server then sends the result of the integrity validation over a separate socket connection to a small server that the client runs on a specific port. When the client receives the result, it displays it and the client program terminates, completing the file transfer.
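
The server-side check can be sketched like this. It is not the code from the actual script: verify_and_cleanup is a hypothetical name, and the expected hash is assumed to arrive as 32 lowercase hexadecimal bytes.

```python
import hashlib
import hmac
import os


def verify_and_cleanup(decrypted_path, expected_md5_hex, chunk_size=64 * 1024):
    """Return True if the file matches the expected MD5; delete it otherwise."""
    digest = hashlib.md5()
    with open(decrypted_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    actual = digest.hexdigest().encode("ascii")
    # compare_digest compares the two values in constant time
    ok = hmac.compare_digest(actual, expected_md5_hex)
    if not ok:
        os.remove(decrypted_path)  # integrity failed, so drop the file
    return ok
```

The boolean result is what the server would then report back to the client over the separate socket connection.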

Performance Analysis

I have used AES with a 256-bit key to encrypt the files at the client’s end. However, since this tool is designed to transfer large files, there is a tradeoff between performance and encryption strength. Although AES-256 provides a larger security margin than AES-128, it can affect the performance of the utility on large files, because AES-256 runs 14 rounds per block while AES-128 runs 10. This cost can be reduced by using a 16-byte key instead.

I am using CBC (Cipher Block Chaining) as the AES mode of operation. One problem here is that in CBC mode the ciphertext of one block is used to encrypt the next block of the file, so encryption cannot be parallelized. In decryption, however, a plaintext block can be recovered from two adjacent blocks of ciphertext, so decryption can be parallelized and its performance improved.
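
To make the chaining structure concrete, here is a small sketch of CBC with a toy block function standing in for AES. The toy function (XOR with the key, then reverse the bytes) is NOT secure and is only there so the example runs without PyCrypto; all names are hypothetical. Encryption has to walk the blocks in order, while each decrypted block depends only on its own ciphertext block and the one before it.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt_block(block, key):
    # Toy stand-in for AES (NOT secure): XOR with the key, then reverse.
    return xor(block, key)[::-1]


def toy_decrypt_block(block, key):
    # Inverse of toy_encrypt_block: reverse, then XOR with the key.
    return xor(block[::-1], key)


def cbc_encrypt(plaintext, key, iv):
    """Sequential: each block needs the previous ciphertext block."""
    assert len(plaintext) % BLOCK == 0
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(xor(plaintext[i:i + BLOCK], prev), key)
        out.append(prev)
    return b"".join(out)


def cbc_decrypt_parallel(ciphertext, key, iv):
    """Independent: block i needs only ciphertext blocks i-1 and i."""
    blocks = [ciphertext[i:i + BLOCK] for i in range(0, len(ciphertext), BLOCK)]
    previous = [iv] + blocks[:-1]

    def decrypt_one(pair):
        prev, cur = pair
        return xor(toy_decrypt_block(cur, key), prev)

    with ThreadPoolExecutor() as pool:
        # map() returns results in input order, so the blocks join up correctly
        return b"".join(pool.map(decrypt_one, zip(previous, blocks)))
```

The thread pool here only illustrates that the blocks are independent; in CPython, pure Python code like this will not actually run faster on threads, so a real speedup would depend on the cipher implementation doing the work outside the interpreter lock.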

Leave a comment if any correction is needed or if you need any clarification.

Thank you.

References:
http://eli.thegreenplace.net/2010/06/25/aes-encryption-of-files-in-python-with-pycrypto/
https://code.google.com/p/pyftpdlib/wiki/Tutorial
https://launchkey.com/docs/api/encryption

