Data persistence with AWS Neptune

One of the services that I’m using from LocalStack is AWS Neptune. Today I started looking into enabling data persistence for LocalStack. Unfortunately, I ran into an issue: with persistence enabled, AWS Neptune doesn’t seem to start or work correctly at all.

Here’s my docker-compose.yml configuration for localstack:

localstack:
    container_name: "${LOCALSTACK_DOCKER_NAME:-localstack-main}"
    image: localstack/localstack-pro
    ports:
      - "127.0.0.1:4566:4566"
      - "127.0.0.1:4510-4559:4510-4559"
    environment:
      - LOCALSTACK_AUTH_TOKEN=${LOCALSTACK_AUTH_TOKEN}
      - SERVICES=s3,dynamodb,neptune,sqs
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - PERSISTENCE=1
      - SNAPSHOT_FLUSH_INTERVAL=180
    volumes:
      - "${LOCALSTACK_VOLUME_DIR:-./volume}:/var/lib/localstack"

Here are the logs after performing a docker compose up:

localstack-main  |
localstack-main  | LocalStack version: 3.3.1.dev20240417184608
localstack-main  | LocalStack build date: 2024-04-17
localstack-main  | LocalStack build git hash: 94f3899
localstack-main  |
localstack-main  | 2024-06-06T10:12:23.681  INFO --- [  MainThread] l.bootstrap.licensingv2    : Successfully activated cached license ...... from /var/lib/localstack/cache/license.json 🔑✅
localstack-main  | 2024-06-06T10:12:24.125  WARN --- [  MainThread] l.p.snapshot.plugins       : registering ON_REQUEST load strategy: this strategy has known limitations to not restore state correctly for certain services
localstack-main  | 2024-06-06T10:12:24.455  INFO --- [  MainThread] l.p.snapshot.plugins       : registering SCHEDULED save strategy
localstack-main  | 2024-06-06T10:12:24.471  INFO --- [  MainThread] l.extensions.platform      : loaded 0 extensions
localstack-main  | 2024-06-06T10:12:24.477  INFO --- [-functhread4] hypercorn.error            : Running on https://0.0.0.0:4566 (CTRL + C to quit)
localstack-main  | 2024-06-06T10:12:24.477  INFO --- [-functhread4] hypercorn.error            : Running on https://0.0.0.0:4566 (CTRL + C to quit)
localstack-main  | 2024-06-06T10:12:24.477  INFO --- [-functhread4] hypercorn.error            : Running on https://0.0.0.0:443 (CTRL + C to quit)
localstack-main  | 2024-06-06T10:12:24.477  INFO --- [-functhread4] hypercorn.error            : Running on https://0.0.0.0:443 (CTRL + C to quit)
localstack-main  | 2024-06-06T10:12:24.630  INFO --- [   asgi_gw_0] l.services.s3.v3.provider  : Using /var/lib/localstack/state/s3 as storage path for s3 assets
localstack-main  | 2024-06-06T10:12:24.670  INFO --- [   asgi_gw_0] l.persistence.manager      : Loading state for s3 took 82 ms
localstack-main  | 2024-06-06T10:12:24.671  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS s3.ListBuckets => 200
localstack-main  | Ready.
localstack-main  | 2024-06-06T10:12:25.439  INFO --- [   asgi_gw_0] l.persistence.manager      : Loading state for dynamodb took 755 ms
localstack-main  | 2024-06-06T10:12:25.465  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS dynamodb.ListTables => 200
localstack-main  | 2024-06-06T10:12:25.920  WARN --- [   asgi_gw_1] localstack.utils.functions : error calling function load: Unsupported RDS DB engine type: neptune
localstack-main  | 2024-06-06T10:12:25.920  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS rds.DescribeDBClusters => 200
localstack-main  | 2024-06-06T10:12:25.951  INFO --- [   asgi_gw_0] l.persistence.manager      : Loading state for sqs took 23 ms
localstack-main  | 2024-06-06T10:12:25.955  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS sqs.ListQueues => 200
localstack-main  | 2024-06-06T10:12:26.784  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS sqs.GetQueueAttributes => 200
localstack-main  | 2024-06-06T10:12:36.795  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS sqs.ReceiveMessage => 200
localstack-main  | 2024-06-06T10:12:46.810  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS sqs.ReceiveMessage => 200
localstack-main  | 2024-06-06T10:12:56.827  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS sqs.ReceiveMessage => 200
localstack-main  | 2024-06-06T10:13:06.842  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS sqs.ReceiveMessage => 200
localstack-main  | 2024-06-06T10:13:16.855  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS sqs.ReceiveMessage => 200
localstack-main  | 2024-06-06T10:13:26.870  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS sqs.ReceiveMessage => 200
localstack-main  | 2024-06-06T10:13:37.891  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS sqs.ReceiveMessage => 200
localstack-main  | 2024-06-06T10:13:47.908  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS sqs.ReceiveMessage => 200

When I remove the PERSISTENCE setting from docker-compose.yml, everything starts working without any issues:

localstack-main  |
localstack-main  | LocalStack version: 3.3.1.dev20240417184608
localstack-main  | LocalStack build date: 2024-04-17
localstack-main  | LocalStack build git hash: 94f3899
localstack-main  |
localstack-main  | 2024-06-06T10:16:40.172  INFO --- [  MainThread] l.bootstrap.licensingv2    : Successfully activated cached license ..... from /var/lib/localstack/cache/license.json 🔑✅
localstack-main  | 2024-06-06T10:16:40.921  INFO --- [  MainThread] l.extensions.platform      : loaded 0 extensions
localstack-main  | 2024-06-06T10:16:40.927  INFO --- [-functhread4] hypercorn.error            : Running on https://0.0.0.0:4566 (CTRL + C to quit)
localstack-main  | 2024-06-06T10:16:40.927  INFO --- [-functhread4] hypercorn.error            : Running on https://0.0.0.0:4566 (CTRL + C to quit)
localstack-main  | 2024-06-06T10:16:40.927  INFO --- [-functhread4] hypercorn.error            : Running on https://0.0.0.0:443 (CTRL + C to quit)
localstack-main  | 2024-06-06T10:16:40.927  INFO --- [-functhread4] hypercorn.error            : Running on https://0.0.0.0:443 (CTRL + C to quit)
localstack-main  | Ready.
localstack-main  | 2024-06-06T10:16:42.004  INFO --- [   asgi_gw_0] l.services.s3.v3.provider  : Using /tmp/localstack/state/s3 as storage path for s3 assets
localstack-main  | 2024-06-06T10:16:42.150  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS s3.ListBuckets => 200
localstack-main  | 2024-06-06T10:16:42.153  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS s3.CreateBucket => 200
localstack-main  | 2024-06-06T10:16:42.870  INFO --- [   asgi_gw_0] localstack.utils.bootstrap : Execution of "require" took 707.00ms
localstack-main  | 2024-06-06T10:16:43.007  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS dynamodb.ListTables => 200
localstack-main  | 2024-06-06T10:16:43.066  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS dynamodb.CreateTable => 200
localstack-main  | 2024-06-06T10:16:43.076  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS dynamodb.DescribeTable => 200
localstack-main  | 2024-06-06T10:16:43.094  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS dynamodb.CreateTable => 200
localstack-main  | 2024-06-06T10:16:43.098  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS dynamodb.DescribeTable => 200
localstack-main  | 2024-06-06T10:16:43.103  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS dynamodb.UpdateTimeToLive => 200
localstack-main  | 2024-06-06T10:16:43.712  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS rds.DescribeDBClusters => 404 (DBClusterNotFoundFault)
localstack-main  | 2024-06-06T10:16:43.759  INFO --- [functhread25] l.s.neptune.tinkerpop      : Starting Neptune DB instance on port 4510
localstack-main  | 2024-06-06T10:16:46.269  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS rds.CreateDBCluster => 200
localstack-main  | 2024-06-06T10:16:46.283  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS rds.DescribeDBClusters => 200
localstack-main  | 2024-06-06T10:16:46.385  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS sqs.ListQueues => 200
localstack-main  | 2024-06-06T10:16:46.388  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS sqs.CreateQueue => 200
localstack-main  | 2024-06-06T10:16:47.180  INFO --- [   asgi_gw_0] localstack.request.aws     : AWS sqs.GetQueueAttributes => 200
localstack-main  | 2024-06-06T10:16:57.191  INFO --- [   asgi_gw_1] localstack.request.aws     : AWS sqs.ReceiveMessage => 200

With persistence enabled, I can’t run Gremlin queries because the client can’t connect to Neptune on port 4510, most probably because the service is not running:

  File "/usr/src/app/app/lib/tasks.py", line 64, in assign_profiles_to_clusters
    .next()
     ^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/process/traversal.py", line 117, in next
    return self.__next__()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/process/traversal.py", line 48, in __next__
    self.traversal_strategies.apply_strategies(self)
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/process/traversal.py", line 701, in apply_strategies
    traversal_strategy.apply(traversal)
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/driver/remote_connection.py", line 78, in apply
    remote_traversal = self.remote_connection.submit(traversal.bytecode)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/driver/driver_remote_connection.py", line 104, in submit
    result_set = self._client.submit(bytecode, request_options=self._extract_request_options(bytecode))
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/driver/client.py", line 169, in submit
    return self.submit_async(message, bindings=bindings, request_options=request_options).result()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/driver/client.py", line 204, in submit_async
    return conn.write(message)
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/driver/connection.py", line 60, in write
    self.connect()
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/driver/connection.py", line 50, in connect
    self._transport.connect(self._url, self._headers)
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/driver/aiohttp/transport.py", line 79, in connect
    self._loop.run_until_complete(async_connect())
  File "/usr/local/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/gremlin_python/driver/aiohttp/transport.py", line 69, in async_connect
    self._websocket = await self._client_session.ws_connect(url, **self._aiohttp_kwargs, headers=headers)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client.py", line 779, in _ws_connect
    resp = await self.request(
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/client.py", line 536, in _request
    conn = await self._connector.connect(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/connector.py", line 540, in connect
    proto = await self._create_connection(req, traces, timeout)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/connector.py", line 901, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/connector.py", line 1206, in _create_direct_connection
    raise last_exc
  File "/usr/local/lib/python3.11/site-packages/aiohttp/connector.py", line 1175, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/connector.py", line 988, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host localstack:4510 ssl:default [Connect call failed ('192.168.207.2', 4510)]
Unclosed client session
client_session: <aiohttp.client.ClientSession object at 0xffffb52f9110>
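
A quick way to double-check that nothing is listening on the Neptune port, independent of gremlin_python, is a plain TCP probe. This is only a minimal sketch: the helper name `port_open` is made up for illustration, and the host and port are the ones from the traceback above.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `port_open("localstack", 4510)` from the app container should return False while the connection error above occurs, and True once Neptune is actually listening.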

Is this a limitation of AWS Neptune in LocalStack right now?

Hi, please take a look at our persistence coverage docs here: Persistence Coverage for AWS Services | Docs

Neptune isn’t supported at the moment.

Ah, I see. Is that something you’re already working on, or is there no timeline yet for when it’ll be implemented? Either way, is there a way to enable persistence only for some services, for example SQS and DynamoDB, and keep it off for AWS Neptune?

Hi — We are working on improving the persistence support. I will update the thread once we have valid persistence tests for Neptune.

Persistence for the other services is available. There is no way to turn it off for Neptune only, though. I would suggest using an init hook to create the Neptune cluster at startup while LocalStack restores the state of the other services.
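
As a rough illustration of that suggestion, an init hook could be a small Python script mounted into /etc/localstack/init/ready.d. The following is an untested sketch under several assumptions: that boto3 is importable where the hook runs, that the edge port 4566 is reachable on localhost inside the container, and that the environment variables `NEPTUNE_INIT_HOOK` and `NEPTUNE_CLUSTER_ID` (both made up for this example) are set in docker-compose.yml. It also does not handle the case where the restored state already contains a record of the cluster, which may need a delete first.

```python
import os


def create_neptune_cluster(client, cluster_id):
    """Create a Neptune cluster via the given client and return its description."""
    response = client.create_db_cluster(
        DBClusterIdentifier=cluster_id,
        Engine="neptune",
    )
    return response["DBCluster"]


def main():
    # Imported lazily so the helper above stays usable without boto3 installed.
    import boto3

    client = boto3.client(
        "neptune",
        endpoint_url="http://localhost:4566",  # assumed edge URL inside the container
        region_name=os.environ.get("AWS_DEFAULT_REGION") or "us-east-1",
        aws_access_key_id=os.environ.get("AWS_ACCESS_KEY_ID") or "test",
        aws_secret_access_key=os.environ.get("AWS_SECRET_ACCESS_KEY") or "test",
    )
    # NEPTUNE_CLUSTER_ID is a hypothetical variable; "my-neptune" is a placeholder.
    cluster_id = os.environ.get("NEPTUNE_CLUSTER_ID") or "my-neptune"
    cluster = create_neptune_cluster(client, cluster_id)
    print(f"Created Neptune cluster {cluster['DBClusterIdentifier']} on port {cluster.get('Port')}")


# Hypothetical opt-in switch so the script does nothing unless explicitly enabled.
if os.environ.get("NEPTUNE_INIT_HOOK") == "1":
    main()
```

To try it, the script would be mounted with an extra volume entry such as "./init/create_neptune.py:/etc/localstack/init/ready.d/create_neptune.py" (the host path is a placeholder), and NEPTUNE_INIT_HOOK=1 added to the environment list.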