Parallel Processing on AWS Lambda With Python Using Multiprocessing
If you are trying to use multiprocessing.Queue or multiprocessing.Pool on AWS Lambda, you are probably getting the exception:
    sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
OSError: [Errno 38] Function not implemented
The reason is that the Lambda execution environment does not support shared memory for processes, which both multiprocessing.Queue and multiprocessing.Pool rely on.
As a workaround, Lambda does support multiprocessing.Pipe, which you can use instead of a Queue to pass data between processes.
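To illustrate, here is a minimal sketch of using a Pipe to get a result back from a child process; the function and variable names are my own for illustration, and the same pattern runs inside a Lambda handler:

```python
import multiprocessing

def worker(conn):
    # The child sends its result through the write end of the pipe.
    conn.send("hello from child")
    conn.close()

def run():
    # duplex=False creates a one-way pipe: the first connection
    # is read-only (parent side), the second is write-only (child side).
    parent_conn, child_conn = multiprocessing.Pipe(duplex=False)
    p = multiprocessing.Process(target=worker, args=(child_conn,))
    p.start()
    result = parent_conn.recv()
    p.join()
    return result
```

Because Pipe uses a file-descriptor pair rather than a shared-memory semaphore, it avoids the SemLock call that raises OSError on Lambda.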
Parallel Processing on Lambda Example
Below is a very basic example of how you can run parallel processing on AWS Lambda with Python:
import time
import multiprocessing

region_maps = {
    "eu-west-1": {"dynamodb": "dynamodb.eu-west-1.amazonaws.com"},
    "us-east-1": {"dynamodb": "dynamodb.us-east-1.amazonaws.com"},
    "us-east-2": {"dynamodb": "dynamodb.us-east-2.amazonaws.com"}
}

def multiprocessing_func(region):
    time.sleep(1)
    endpoint = region_maps[region]['dynamodb']
    print('endpoint for {} is {}'.format(region, endpoint))

def lambda_handler(event, context):
    starttime = time.time()
    processes = []
    regions = ['us-east-1', 'us-east-2', 'eu-west-1']

    for region in regions:
        p = multiprocessing.Process(target=multiprocessing_func, args=(region,))
        processes.append(p)
        p.start()

    for process in processes:
        process.join()

    output = 'That took {} seconds'.format(time.time() - starttime)
    print(output)
    return output
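The example above only prints from each child process. If you also need the results back in the handler, you can combine the same pattern with Pipe. The sketch below is my own variant of the handler, not part of the original example, and it assumes the result payloads are small:

```python
import multiprocessing

region_maps = {
    "eu-west-1": {"dynamodb": "dynamodb.eu-west-1.amazonaws.com"},
    "us-east-1": {"dynamodb": "dynamodb.us-east-1.amazonaws.com"},
    "us-east-2": {"dynamodb": "dynamodb.us-east-2.amazonaws.com"}
}

def multiprocessing_func(region, conn):
    # Send the result back through the pipe instead of printing it.
    endpoint = region_maps[region]['dynamodb']
    conn.send((region, endpoint))
    conn.close()

def lambda_handler(event, context):
    processes = []
    parent_connections = []

    for region in ['us-east-1', 'us-east-2', 'eu-west-1']:
        parent_conn, child_conn = multiprocessing.Pipe()
        parent_connections.append(parent_conn)
        p = multiprocessing.Process(target=multiprocessing_func,
                                    args=(region, child_conn))
        processes.append(p)
        p.start()

    # Small payloads fit in the pipe buffer, so joining before
    # receiving does not block here; for large payloads, recv first.
    for p in processes:
        p.join()

    return dict(conn.recv() for conn in parent_connections)
```

Each parent connection receives the (region, endpoint) pair from its own child, so the handler can return a plain dict of results.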
Please feel free to show support by sharing this post, making a donation, or subscribing, and reach out to me if you want me to demo and write up any specific tech topic.