Unit Testing: Serverless, Python Lambda, and PyMySQL on Top of AWS
I have been digging into serverless, specifically providing functionality through AWS API Gateway. While it seems relatively straightforward to provide APIs that query a MySQL database, what does not seem straightforward is unit testing the Python lambda.
There are specific tools to mock the pymysql connect method (https://www.google.com/search?q=mock+pymysql.connect), but this is just the first part of the problem.
If we follow the template described by https://www.isc.upenn.edu/accessing-mysql-databases-aws-python-lambda-function, we get the following in our Python lambda:
#!/usr/bin/python
import sys
import logging
import rds_config
import pymysql

# RDS settings
rds_host = rds_config.db_endpoint
name = rds_config.db_username
password = rds_config.db_password
db_name = rds_config.db_name
port = 3306

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# This block runs once, when the module is imported.
try:
    # PyMySQL 1.0+ accepts keyword arguments only, so the host is passed by name.
    conn = pymysql.connect(host=rds_host, user=name, passwd=password, db=db_name, connect_timeout=5)
except Exception:
    logger.error("ERROR: Unexpected error: Could not connect to MySql instance.")
    sys.exit()

logger.info("SUCCESS: Connection to RDS mysql instance succeeded")

def lambda_handler(event, context):
    item_count = 0
We also added a connection ping inside lambda_handler, so that the module-level connection can be reused and the startup time stays quick. From what I am able to determine, this gives us a form of connection pooling, but I am not an expert in this area.
The question is, how would you unit test lambda_handler? In a unit test, you would import the system under test (SUT), which would execute the pymysql.connect call in the above code.
I think of this as something like Java static initialization code: it will always be executed, and in a unit test the connection cannot be made. One option is to detect whether the code is running in a unit test, for example with an if __name__ == '__main__' check. This can work when the tests live in the same file and that file is run directly, but it does _not_ work for lambda functions: the Lambda runtime imports the handler module, so it never runs as __main__.
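To make the import behaviour concrete, here is a minimal sketch that imports a small stand-in module the same way a test (or the Lambda runtime) would. The module name demo_handler and the helper import_from_source are made up for illustration and are not part of the original code.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for index.py: the top-level code runs on import,
# much like the pymysql.connect call in the real handler module.
MODULE_SOURCE = '''
events = []
events.append("top-level code ran, __name__ = " + __name__)

def lambda_handler(event, context):
    return {"statusCode": 200}
'''


def import_from_source(module_name, source):
    """Write the source to a temporary directory and import it by name."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, module_name + ".py").write_text(source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            return importlib.import_module(module_name)
        finally:
            sys.path.remove(tmp)


demo = import_from_source("demo_handler", MODULE_SOURCE)
print(demo.events)
# ['top-level code ran, __name__ = demo_handler']
```

The top-level code has already run by the time the import returns, and __name__ is the module name rather than __main__, so a __main__ check cannot tell a test import from a Lambda import.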
There have been many discussions about including unit test code within a production codebase. I am inclined to keep all unit test code out of production code, but there is no consensus on this.
So adding some sort of unit test awareness to the lambda is blocked on two fronts.
There are two real problems:
1) the initial connect and the sys.exit() call
2) the ping function called within the lambda
After digging into this pretty deeply, but coming from a Java/.NET perspective, I was looking for something in the mocking library to handle these calls.
So, how about the ping call? What if that ping call could be mocked as its own method? First, move the ping into its own method, something like this (from the lambda's index.py):
def pingConnection():
    try:
        connection.ping(reconnect=True)
    except Exception as e:
        print(e)
Then use the unittest library to patch the lambda's methods (tests.py):
import unittest
from unittest.mock import patch

@patch('index.executeQuery')
@patch('index.pingConnection')
@patch('index.closeConnection')
class TestHandler(unittest.TestCase):
    # Mocks are passed bottom-up: the decorator closest to the class comes first.
    def test_status_code(self, closeConnection_mock, pingConnection_mock, executeQuery_mock):
        # Arrange
        executeQuery_mock.return_value = ["3D49717D5FDC11E9BFE00601CE985762", "MSRVQ"]
This allows the database call methods to be patched during unit test execution, including the ability to return a specific response from the executeQuery method. This works pretty well, with one catch: at first I had to set the return value on *all* of the mocked methods in order to get a value back from a single one, which seemed a bit strange. The likely cause is argument order: stacked patch decorators are applied bottom-up, so the mock for the decorator closest to the class is passed first. With the parameters listed in that order, only executeQuery_mock needs a return value.
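The ordering of stacked patch decorators can be checked in isolation. The sketch below registers a hypothetical stand-in module called fake_index (it is not the real index.py) and asserts which mock arrives in which parameter.

```python
import sys
import types
import unittest
from unittest.mock import patch

# Hypothetical stand-in for index.py, registered so patch() can find it by name.
fake_index = types.ModuleType("fake_index")
fake_index.executeQuery = lambda: "real query"
fake_index.pingConnection = lambda: "real ping"
fake_index.closeConnection = lambda: "real close"
sys.modules["fake_index"] = fake_index


@patch("fake_index.executeQuery")     # applied last, so passed as the last mock
@patch("fake_index.pingConnection")   # applied second, so passed second
@patch("fake_index.closeConnection")  # applied first, so passed first
class TestPatchOrder(unittest.TestCase):
    def test_mock_order(self, closeConnection_mock, pingConnection_mock, executeQuery_mock):
        executeQuery_mock.return_value = ["3D49717D5FDC11E9BFE00601CE985762", "MSRVQ"]
        # Each parameter is the mock that replaced the function of the same name.
        self.assertIs(fake_index.closeConnection, closeConnection_mock)
        self.assertIs(fake_index.pingConnection, pingConnection_mock)
        self.assertIs(fake_index.executeQuery, executeQuery_mock)
        self.assertEqual(fake_index.executeQuery()[1], "MSRVQ")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPatchOrder)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If the parameters were listed in the same top-to-bottom order as the decorators, the name executeQuery_mock would actually hold the closeConnection mock, which would explain a return value that only takes effect when it is set on every mock.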
OK, this seems to be working pretty well. How about the static code? I could not find a way to mock this connection *in this case*. Specifically, the connection code runs _prior_ to the mock code, because it executes as soon as the test imports the module. That is, we would need to move this initialization code into a method so that it is not executed on import, and then we could mock it like the methods above. Unfortunately, this removes the value of having this type of code (the poor man's connection pooling).
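If the initialization does move into a method, some of the reuse might be kept by caching the connection after the first call. The following is only a sketch of that idea and is not from the original code: ConnectionHolder and make_handler are hypothetical names, and the connect callable stands in for a wrapper around pymysql.connect.

```python
from unittest.mock import MagicMock


class ConnectionHolder:
    """Opens the connection on first use and caches it for later calls."""

    def __init__(self, connect):
        self._connect = connect  # zero-argument callable, e.g. a pymysql.connect wrapper
        self._conn = None

    def get(self):
        if self._conn is None:
            self._conn = self._connect()
        return self._conn


def make_handler(holder):
    """Build a lambda_handler that takes its connection from the holder."""
    def lambda_handler(event, context):
        conn = holder.get()
        conn.ping(reconnect=True)
        return {"statusCode": 200}
    return lambda_handler


# In a test, the connect callable is a mock, so nothing touches the network.
fake_conn = MagicMock()
connect_mock = MagicMock(return_value=fake_conn)
handler = make_handler(ConnectionHolder(connect_mock))

first = handler({}, None)
second = handler({}, None)
print(connect_mock.call_count)    # 1: the second call reused the cached connection
print(fake_conn.ping.call_count)  # 2: ping ran on every invocation
```

Under this assumption nothing connects at import time, and as long as the holder lives at module level the connection would still be created only once and then reused by later invocations.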
How about removing the specific problem, the sys.exit() call? That way, the unit test will print an error, but execution will continue.
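A minimal sketch of what that might look like is below. So that it runs without a database, connect_to_database is a hypothetical stand-in for the pymysql.connect call and always fails, the way it would on a test machine.

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def connect_to_database():
    """Hypothetical stand-in for pymysql.connect(...); always fails here."""
    raise ConnectionError("database not reachable")


try:
    conn = connect_to_database()
except Exception:
    # No sys.exit() here: log the failure and let the import finish.
    logger.error("ERROR: Unexpected error: Could not connect to MySql instance.")
    conn = None

print("import finished, conn =", conn)
# import finished, conn = None
```

The error is still logged, but the module finishes loading, so a test can import it and patch the database methods afterwards.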
This approach seems to work, but it definitely has some issues. One that immediately becomes apparent is that an exception here should do more than just print an error, because this code is also executed in production, where the lambda would carry on without a connection. Along with a few other problems, this means it is definitely not a final solution, but maybe a first step.
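One possible next step, which I have not tried against a real Lambda, is to put a fake pymysql module in place before the handler module is imported, so that the module-level connect call receives a mock. The sketch below uses a trimmed-down, hypothetical index_demo module instead of the real index.py, and the host and credentials in it are placeholders.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

# Hypothetical, trimmed-down handler module that connects at import time.
INDEX_SOURCE = '''
import sys
import pymysql

try:
    conn = pymysql.connect(host="example-host", user="u", passwd="p", db="d", connect_timeout=5)
except Exception:
    sys.exit()

def lambda_handler(event, context):
    conn.ping(reconnect=True)
    return {"statusCode": 200}
'''

fake_pymysql = MagicMock(name="pymysql")

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "index_demo.py").write_text(INDEX_SOURCE)
    # Both patches are active before the import, so "import pymysql" inside
    # the module resolves to the mock instead of the real library.
    with patch.dict(sys.modules, {"pymysql": fake_pymysql}):
        with patch.object(sys, "path", [tmp] + sys.path):
            importlib.invalidate_caches()
            index_demo = importlib.import_module("index_demo")

response = index_demo.lambda_handler({}, None)
print(response)
# {'statusCode': 200}
```

Because the patch is in place before the import, the connection code and the sys.exit() call can stay as they are in production, and the test still gets a mock connection to assert against.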