Creating an AWS S3 service with Python
Overview
In a previous module we used the boto3 library to connect a Python script to an AWS service. Here, we are going to continue down this path by taking a look at some operations we can perform on AWS S3 with Python. We will create a storage service in Python that interfaces with S3, giving us a chance to use some common operations.
Initialize the Environment
We will use a virtual environment to develop our code in isolation. Developing in a virtual environment lets us manage our projects independently, making our code more portable and less coupled to our development machine.
- To create a new environment named sample-boto-s3, execute:
$ python3 -m venv ~/python-envs/sample-boto-s3
- To activate the environment, execute:
$ source ~/python-envs/sample-boto-s3/bin/activate
- Install the boto3 package using:
$ pip3 install boto3
Setting up the project
Setting up access to S3 as a service encapsulates the code needed to interact with S3. This in turn makes the service more reusable and portable for other projects.
- Open up your favorite Python IDE and let's get to the good stuff...code!
- Create a new Python script file called storage_service.py and save it to the sample-boto-s3 folder created for the virtual environment previously.
Create the Service
Open storage_service.py and enter the following:
import boto3  # (1)

class StorageService:
    def __init__(self, storage_location):  # (2)
        self.client = boto3.client('s3')  # (3)
        self.bucket_name = storage_location  # (4)
- In order to leverage the boto3 library, we must import it first
- Passing the storage location at the time of construction decouples the storage location from the class code. When a StorageService object is created, the bucket location is passed into the constructor, allowing multiple storage services to exist, each representing a different storage location (see the sketch after this list)
- Using boto3.client('s3') returns an S3 client object that can be used to interact with the S3 cloud service
- Assign the bucket name to an instance variable. This will be used later to configure the S3 client
The boto3 object becomes available because of the import statement at the top of the file. This object produces a client that can interface with the service passed as an argument, in this case 's3'.
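As a concrete illustration of that decoupling, two services can coexist and point at different buckets. A minimal sketch; the bucket names here are hypothetical placeholders:

# Hypothetical bucket names, used purely for illustration
logs_service = StorageService('my-app-logs-bucket')
images_service = StorageService('my-app-images-bucket')
# Each instance targets its own bucket but shares the same client API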
Uploading Files
The first behaviour to add to this service is the ability to upload files to the specified bucket. This method uploads a local file to an S3 bucket.
def upload_file(self, file_name, object_name=None):  # (1)
    if object_name is None:  # (2)
        object_name = file_name
    response = self.client.upload_file(file_name, self.bucket_name, object_name)  # (3)
    return response  # (4)
- A method that takes a file_name parameter representing the local path of the file to be uploaded to S3. object_name is an optional parameter to rename the file on S3
- If no value is provided for the rename, the current file name will be used
- The upload_file() method is called on the S3 client and the proper values are passed (a variation that attaches metadata is sketched after this list):
  - Param 1 - path to the local file
  - Param 2 - name of the bucket to upload to
  - Param 3 - name of the object on S3
- Return the response sent from the API call (if any)
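If you later need to attach metadata such as a content type to the uploaded object, the client's upload_file() accepts an optional ExtraArgs dictionary. A minimal sketch of a hypothetical variant method, assuming the same client and bucket as above; the content type shown is just an example:

def upload_file_with_type(self, file_name, object_name=None, content_type='text/plain'):
    # Hypothetical variant: attach a content type so the object is served correctly
    if object_name is None:
        object_name = file_name
    return self.client.upload_file(
        file_name, self.bucket_name, object_name,
        ExtraArgs={'ContentType': content_type}  # example metadata, adjust per file
    )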
Run the code
Create a file to upload:
$ touch file.txt
Next, create a Python script to instantiate and run the storage service:
service_runner.py
from storage_service import StorageService  # (1)

storage_service = StorageService("your.first.boto.s3.bucket")  # (2)
print(storage_service.upload_file('file.txt'))  # (3)
- Import the storage service
- Instantiate a new service with a bucket location
- Call the service, using the text file as the upload
Notice that nothing is returned from the AWS API call here, but this is not always the case.
Log in to the AWS web console to confirm the file was uploaded successfully.
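Also note that a failed upload surfaces as an exception rather than a return value. A minimal sketch of guarding the call, assuming boto3's standard S3UploadFailedError; the handling shown is illustrative:

from boto3.exceptions import S3UploadFailedError

try:
    storage_service.upload_file('file.txt')
except S3UploadFailedError as error:
    # Raised when, e.g., the bucket does not exist or credentials lack permission
    print(f'Upload failed: {error}')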
Downloading Files
This method uses the name of an S3 object to retrieve it from the cloud and save it to the local system.
def download_object(self, object_name, file_name=None):  # (1)
    if file_name is None:
        file_name = object_name
    response = self.client.download_file(self.bucket_name, object_name, file_name)  # (2)
    return response
- This method takes object_name to retrieve the object from S3 and file_name as an optional parameter to rename the file when saving it locally
- The download_file() method is supplied the following values (a guarded variation is sketched after this list):
  - Param 1 - name of the bucket to access
  - Param 2 - name of the object to retrieve from S3
  - Param 3 - name to use for the file on the local system
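A missing object raises an exception instead of returning an error value. A minimal sketch, assuming botocore's ClientError; the key name is just the file from the earlier step:

from botocore.exceptions import ClientError

try:
    storage_service.download_object('file.txt')
except ClientError as error:
    # download_file raises a ClientError with code 404 when the key is absent
    if error.response['Error']['Code'] == '404':
        print('Object not found in bucket')
    else:
        raise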
Run the Code
service_runner.py
print(storage_service.download_object('file.txt', 'file_s3.txt'))
There should now be a new file saved locally named "file_s3.txt".
Listing Files
The ability for the storage service to return a list of all the objects in the bucket could be useful.
def list_all_objects(self):
    objects = self.client.list_objects(Bucket=self.bucket_name)  # (1)
    if "Contents" in objects:  # (2)
        response = objects["Contents"]
    else:
        response = {}
    return response
- list_objects() only needs a bucket name supplied as a parameter. It returns a lengthy JSON response that is stored as a dictionary of key-value pairs
- The bit we are interested in has the key 'Contents', which contains an array of S3 objects. The problem is that if the bucket is empty, the 'Contents' key will be absent from the JSON. So check first whether the key exists and then assign the response. If the bucket is empty, an empty dictionary is returned (a paginated variation is sketched below)
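One caveat worth noting: list_objects returns at most 1,000 objects per call. For larger buckets, a paginator could be used instead. A minimal sketch of a hypothetical variant method, assuming the same client; this uses the list_objects_v2 paginator:

def list_all_objects_paginated(self):
    # Hypothetical variant: pages through results instead of stopping at 1,000 keys
    paginator = self.client.get_paginator('list_objects_v2')
    contents = []
    for page in paginator.paginate(Bucket=self.bucket_name):
        # Pages for an empty bucket carry no 'Contents' key
        contents.extend(page.get('Contents', []))
    return contents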
Run the Code
service_runner.py
s3_objects = storage_service.list_all_objects()
for file in s3_objects:
    print(file["Key"])
list_all_objects() returns a list of dictionaries with information about the objects contained in the bucket. The object names are found under the 'Key' key in each dictionary.
Deleting Files
Finally, there should be a way to remove objects from the bucket.
def delete_object(self, object_name):
    response = self.client.delete_object(Bucket=self.bucket_name, Key=object_name)  # (1)
    return response
The delete_object() method takes two arguments (a batch variant is sketched after this list):
- Param 1 - the bucket to access
- Param 2 - the S3 object to delete
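When several objects need to be removed at once, the client also offers delete_objects, which accepts up to 1,000 keys per request. A minimal sketch of a hypothetical batch variant, assuming the same client:

def delete_objects(self, object_names):
    # Hypothetical batch variant: removes multiple keys in one request
    response = self.client.delete_objects(
        Bucket=self.bucket_name,
        Delete={'Objects': [{'Key': name} for name in object_names]}
    )
    return response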
Run the code
service_runner.py
print(storage_service.delete_object('file.txt'))
Conclusion
Using the boto3 S3 client, we built a simple storage service to manage file transfers to and from a specified bucket. We took care to build it in a way that lets us reuse the service and its code in any project we wish to add it to. If we ever need to add or remove functionality, the service code can easily accommodate those changes.