How to reuse the underlying TCP connections instead of recreating each time #1239

Closed
tim-chow opened this issue Dec 26, 2024 · 1 comment

@tim-chow

tim-chow commented Dec 26, 2024

In high-concurrency scenarios, frequently creating and destroying TCP connections can have a performance impact. Especially when using HTTPS, a TLS handshake must also be performed. Therefore, I hope to find a way to reuse the underlying TCP connections, and this method must be concurrency-safe.

The current usage pattern is:

    session = get_session()
    async with session.create_client('s3', region_name='us-west-2',
                                     aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                                     aws_access_key_id=AWS_ACCESS_KEY_ID) as client:
        # Upload an object to Amazon S3
        data = b'\x01' * 1024
        resp = await client.put_object(Bucket=bucket,
                                       Key=key,
                                       Body=data)
        print(resp)

This means that every time a file is uploaded, a client needs to be created through the context manager, and the client is automatically destroyed when the context ends. The underlying http_session of the client is responsible for maintaining TCP connections, and when the client context is exited, these connections are also destroyed.

Suppose I instead keep the client's context open until the program exits, as shown below:

    from aiobotocore.session import get_session

    AWS_ACCESS_KEY_ID = "xxx"
    AWS_SECRET_ACCESS_KEY = "xxx"


    class S3Manager:
        def __init__(self):
            self._session = get_session()
            self._client = None

        async def initialize(self):
            self._client = await self._session.create_client(
                's3', region_name='us-west-2',
                aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                aws_access_key_id=AWS_ACCESS_KEY_ID
            ).__aenter__()

        async def dispose(self):
            await self._client.__aexit__(None, None, None)

        async def put_object(self, bucket, key, data):
            return await self._client.put_object(
                Bucket=bucket,
                Key=key,
                Body=data,
            )

My concern is whether the client object is concurrency-safe. At least I think that the AIOHTTPSession in the source code is not coroutine-safe.

@thehesiod
Collaborator

A few comments:

  1. You don't need to store the session.
  2. Why are you wrapping the client in what is effectively another context manager? It's already a context manager, so just use it as is.
  3. Yes, clients are concurrency-safe; that's how they're supposed to be used. A client instance maps to an aiohttp session, which holds a pool of connections whose size you can control via the botocore Config option max_pool_connections.
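
A minimal sketch of points 2 and 3, not taken from the thread itself: one client is opened with a plain `async with`, shared by many concurrent uploads, and given a larger connection pool through `max_pool_connections`. The helper name `upload_many`, the bucket name `my-bucket`, the key names, and the pool size of 50 are hypothetical placeholders, and credentials are assumed to come from the default credential chain rather than being passed explicitly.

```python
import asyncio


async def upload_many(client, bucket, items):
    """Upload (key, data) pairs concurrently through one shared client.

    `upload_many` is a hypothetical helper, not part of aiobotocore.
    """
    tasks = [
        client.put_object(Bucket=bucket, Key=key, Body=data)
        for key, data in items
    ]
    # Every coroutine goes through the same client, and therefore through
    # the same underlying connection pool. gather() returns the results in
    # the order the tasks were passed in.
    return await asyncio.gather(*tasks)


async def main():
    # Imported inside main() so that upload_many() can also be run against
    # a stub client without aiobotocore installed.
    from aiobotocore.session import get_session
    from botocore.config import Config

    # Assumed value: 50 is an arbitrary pool size chosen for illustration.
    config = Config(max_pool_connections=50)

    # The client is its own async context manager; keep it open for as long
    # as uploads are needed instead of recreating it per request.
    async with get_session().create_client(
        "s3", region_name="us-west-2", config=config
    ) as client:
        items = [(f"key-{i}", b"\x01" * 1024) for i in range(100)]
        responses = await upload_many(client, "my-bucket", items)
        print(len(responses))


# To run against a real bucket: asyncio.run(main())
```

Only one session and one client exist for the whole batch, so connections, and any TLS handshakes already completed on them, can be reused across requests instead of being torn down after each upload.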
