Ruan Bekker's Blog

From a Curious mind to Posts on Github

Parallel Processing on AWS Lambda With Python Using Multiprocessing

If you are trying to use multiprocessing.Queue or multiprocessing.Pool on AWS Lambda, you are probably getting the exception:

1
2
3
4
[Errno 38] Function not implemented: OSError

    sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)
OSError: [Errno 38] Function not implemented

The reason for that is due to the Lambda execution environment not having support on shared memory for processes, therefore you can’t use multiprocessing.Queue or multiprocessing.Pool.

As a workaround, Lambda does support the usage of multiprocessing.Pipe instead of Queue.

Parallel Processing on Lambda Example

Below is a very basic example on how you would achieve the task of executing parallel processing on AWS Lambda for Python:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
import time
import multiprocessing

region_maps = {
        "eu-west-1": {
            "dynamodb":"dynamodb.eu-west-1.amazonaws.com"
        },
        "us-east-1": {
            "dynamodb":"dynamodb.us-east-1.amazonaws.com"
        },
        "us-east-2": {
            "dynamodb": "dynamodb.us-east-2.amazonaws.com"
        }
    }

def multiprocessing_func(region):
    time.sleep(1)
    endpoint = region_maps[region]['dynamodb']
    print('endpoint for {} is {}'.format(region, endpoint))

def lambda_handler(event, context):
    starttime = time.time()
    processes = []
    regions = ['us-east-1', 'us-east-2', 'eu-west-1']
    for region in regions:
        p = multiprocessing.Process(target=multiprocessing_func, args=(region,))
        processes.append(p)
        p.start()

    for process in processes:
        process.join()

    output = 'That took {} seconds'.format(time.time() - starttime)
    print(output)
    return output

The output when the function gets invoked:

1
2
3
4
pid: 30913 - endpoint for us-east-1 is dynamodb.us-east-1.amazonaws.com
pid: 30914 - endpoint for us-east-2 is dynamodb.us-east-2.amazonaws.com
pid: 30915 - endpoint for eu-west-1 is dynamodb.eu-west-1.amazonaws.com
That took 1.014902114868164 seconds

Thank You

Please feel free to show support by, sharing this post, making a donation, subscribing or reach out to me if you want me to demo and write up on any specific tech topic.


Comments