
Add InputStreamSubscriber.transferTo(OutputStreamPublisher) optimization #4083

Closed
@fanyang01

Description

Describe the feature

TL;DR: Make InputStreamSubscriber override InputStream#transferTo(OutputStream out) so that it detects whether out is an OutputStreamPublisher and, if so, passes its ByteBuffers to it directly instead of copying them.

Currently, InputStreamSubscriber inherits transferTo(OutputStream out) from java.io.InputStream, which copies data through a single internal buffer that it reuses for every read. This works with ordinary synchronous output streams, such as System.out or ByteArrayOutputStream, but not with asynchronous ones such as OutputStreamPublisher, which is used by BlockingOutputStreamAsyncRequestBody. The reason is that OutputStreamPublisher#write(byte[]) does not copy the buffer; it just passes it to the subscriber via ByteBuffer#wrap(byte[]). A subsequent InputStream#read(byte[] buffer) can therefore overwrite the buffer before the subscriber has read it, which leads to unexpected behavior.
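The aliasing problem can be reproduced without the SDK. The sketch below is only an illustration: RetainingOutputStream is a hypothetical stand-in that keeps a ByteBuffer#wrap view of the caller's array (as described above for OutputStreamPublisher), and copyWithReusedBuffer is a hand-written loop that reuses one buffer the way the inherited transferTo does. It does not use any SDK class.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class AliasingDemo {

    /** Hypothetical stand-in: retains a view of the caller's array instead of copying it. */
    static final class RetainingOutputStream extends OutputStream {
        final List<ByteBuffer> pending = new ArrayList<>();

        @Override
        public void write(int b) {
            pending.add(ByteBuffer.wrap(new byte[] {(byte) b}));
        }

        @Override
        public void write(byte[] b, int off, int len) {
            pending.add(ByteBuffer.wrap(b, off, len)); // no copy: shares b with the caller
        }
    }

    /** Decodes a buffer's remaining bytes without moving its position. */
    static String decode(ByteBuffer buf) {
        return StandardCharsets.US_ASCII.decode(buf.duplicate()).toString();
    }

    /** Copies data in chunks through ONE reused buffer, then reports what the sink still sees. */
    static List<String> copyWithReusedBuffer(byte[] data, int chunkSize) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        RetainingOutputStream out = new RetainingOutputStream();
        byte[] buffer = new byte[chunkSize]; // reused for every read
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) >= 0) {
            out.write(buffer, 0, n);
        }
        List<String> seen = new ArrayList<>();
        for (ByteBuffer b : out.pending) {
            seen.add(decode(b)); // the "subscriber" reads only after the copy has finished
        }
        return seen;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "AAAABBBB".getBytes(StandardCharsets.US_ASCII);
        System.out.println(copyWithReusedBuffer(data, 4)); // prints [BBBB, BBBB]
    }
}
```

Both retained buffers point at the same array, so the first chunk ("AAAA") is lost once the second read overwrites it. With a real asynchronous subscriber the outcome depends on timing, which is why the corruption is intermittent.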

To avoid this issue, the simplest workaround is something like out.write(in.readAllBytes()). However, this holds the entire content in memory at once, which is costly for a long input stream.
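A possible middle ground, which I have not verified against the SDK, is to copy in chunks but allocate a fresh array for every chunk, so that no later read can overwrite data already handed to the output stream. The sketch below uses InputStream#readNBytes(int) (Java 11+), which returns a new array on each call. FreshChunkCopy and its copy method are hypothetical names. Whether total memory stays bounded depends on the output stream applying backpressure, which is an assumption here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class FreshChunkCopy {

    /** Copies in to out, giving every chunk its own array; returns the number of bytes copied. */
    static long copy(InputStream in, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        long total = 0;
        byte[] chunk;
        // readNBytes(int) returns a newly allocated array each time and an empty array at end of stream.
        while ((chunk = in.readNBytes(chunkSize)).length > 0) {
            out.write(chunk); // safe even if out retains the array: it is never reused
            total += chunk.length;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("AAAABBBBCC".getBytes(StandardCharsets.US_ASCII));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(in, out, 4);
        System.out.println(copied + " bytes copied");
    }
}
```

This still copies every byte once, so it does not replace the zero-copy optimization requested here; it only limits each allocation to one chunk instead of the whole stream.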

Use Case

I want to open an S3 object as an InputStream in, read its header, modify part of it, and then write the modified header plus the remaining unchanged content to a new S3 object using the putObject(BlockingOutputStreamAsyncRequestBody) interface. Using in.transferTo(out) would be the straightforward way to copy the unchanged content, but currently I have to use out.write(in.readAllBytes()) instead for correctness.

In other words, one has to be very careful when using BlockingOutputStreamAsyncRequestBody. I have encountered cases in which the data written to S3 was malformed because of this issue. Unfortunately, this behavior is not well documented.

Implementing this feature will make such use cases safer and more performant.
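To make the requested behavior concrete, here is a rough sketch of the dispatch idea using stand-in types only. BufferSink, BufferQueueInputStream and CollectingSink are hypothetical and are not SDK classes; the real InputStreamSubscriber is fed asynchronously by a publisher, whereas this sketch simply serves a pre-filled queue of buffers.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class TransferToSketch {

    /** Hypothetical marker for output streams that can accept whole buffers. */
    interface BufferSink {
        void accept(ByteBuffer buffer);
    }

    /** Hypothetical stand-in for InputStreamSubscriber: serves bytes from queued buffers. */
    static final class BufferQueueInputStream extends InputStream {
        private final Deque<ByteBuffer> queue = new ArrayDeque<>();

        BufferQueueInputStream(List<ByteBuffer> buffers) {
            queue.addAll(buffers);
        }

        @Override
        public int read() {
            // Drop exhausted buffers, then serve the next byte (or -1 at end of stream).
            while (!queue.isEmpty() && !queue.peek().hasRemaining()) {
                queue.poll();
            }
            return queue.isEmpty() ? -1 : (queue.peek().get() & 0xFF);
        }

        @Override
        public long transferTo(OutputStream out) throws IOException {
            if (!(out instanceof BufferSink)) {
                return super.transferTo(out); // ordinary streams keep the copying path
            }
            BufferSink sink = (BufferSink) out;
            long total = 0;
            ByteBuffer next;
            while ((next = queue.poll()) != null) {
                total += next.remaining();
                sink.accept(next); // hand over the buffer itself, no copy
            }
            return total;
        }
    }

    /** Hypothetical stand-in for OutputStreamPublisher. */
    static final class CollectingSink extends OutputStream implements BufferSink {
        final List<ByteBuffer> received = new ArrayList<>();

        @Override
        public void write(int b) {
            received.add(ByteBuffer.wrap(new byte[] {(byte) b}));
        }

        @Override
        public void accept(ByteBuffer buffer) {
            received.add(buffer);
        }
    }

    public static void main(String[] args) throws IOException {
        List<ByteBuffer> buffers = List.of(
                ByteBuffer.wrap(new byte[] {1, 2, 3}),
                ByteBuffer.wrap(new byte[] {4, 5}));
        BufferQueueInputStream in = new BufferQueueInputStream(buffers);
        in.read(); // consume a one-byte "header" first
        CollectingSink sink = new CollectingSink();
        System.out.println(in.transferTo(sink) + " bytes handed over without copying");
        System.out.println(new BufferQueueInputStream(buffers).transferTo(new ByteArrayOutputStream()));
    }
}
```

A partially read buffer is handed over with its position already advanced, so the "read a header, then transfer the rest" use case is covered. Any other OutputStream falls back to the inherited copying behavior.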

Proposed Solution

No response

Other Information

No response

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

AWS Java SDK version used

2.20.19

JDK version used

OpenJDK 64-Bit Server VM Corretto-17.0.3.6.1

Operating System and version

Darwin Kernel Version 22.5.0

Metadata

Assignees

No one assigned

    Labels

    bug: This issue is a bug.
    p2: This is a standard priority issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
