Proof of concept of a reader-writer split (#2761) by bal-e · Pull Request #2795 · rustls/rustls

bal-e · 2025-12-05T10:41:32Z

This is a proof of concept adding module rustls::split, an alternative (currently buffered) API which splits a post-handshake TLS connection into a Reader and Writer. It demonstrates a possible API and implementation for #2761.

The API

The split() function can split a TLS 1.2/1.3 client/server connection with a completed handshake into (Reader, Writer). These types can be used independently, e.g. from different threads or asynchronous tasks. It is sometimes necessary to communicate between them (e.g. to send alerts to a misbehaving peer, or if the peer requests we update our encryption keys). But this happens quite rarely, and the user can implement that communication using their own synchronization mechanisms (e.g. mutexes or channels).

Reader implements the reader-side functionality of ConnectionCommon. It offers recv_tls() to read from the underlying stream (equivalent to read_tls() + process_new_packets()) and reader() to access buffered plaintext data. The most important difference is that recv_tls() can return a WriterAction, which the user must communicate to the Writer.

Writer implements the writer-side functionality of ConnectionCommon. It offers send_tls() (i.e. write_tls()) and writer(). In addition, it has an enact() method to consume a WriterAction from the reader.

Justification

Splitting the reader and writer can make life a lot easier for request-response protocols. Even though TLS is an ordered stream protocol, the read and write halves of a connection are not ordered wrt each other. Instead of sending a single request, waiting for a response, and then repeating, implementations can write as many requests as I/O allows and wait for the responses to arrive. In theory, this could significantly improve throughput, especially for connections and requests with high latency; but testing it with rustls has been difficult.

With this API, it is possible that WriterActions are somewhat delayed. Is this a correctness issue? I would argue not. A TLS peer's delayed response to a message could occur because of a delayed WriterAction, or simply delays in the network; the two are indistinguishable. As long as WriterActions are processed relatively quickly, and sent to the right Writer in the right order, it will be fine. The improvement in throughput (which can now be benchmarked, with this PR) is adequate justification in my eyes.

Defensive Measures

Firstly, I want to emphasize the low-level nature of this API. Even with the unbuffered API we have today, implementation error on the user's part can lead to problems. The same is true here. We can still implement defensive measures to mitigate this, but an impossible-to-misuse API is unnecessary and unwarranted.

It is important to ensure that all WriterActions are sent to the right Writer in the right order. I am ignoring timeliness.

I have already implemented some defensive measures with #[must_use] annotations on Received and WriterAction; these mitigate the possibility of the WriterAction being dropped or ignored.
To ensure ordering, WriterActions could include a simple monotonic counter.
Ensuring that they reach the right Writer is more difficult; one way would be to maintain an Arc internally (as the current code does), include it in the WriterAction, and test Arc::ptr_eq.

Implementation

The rustls::split module inlines the state and methods of ClientConnection / ServerConnection, ConnectionCommon, ConnectionCore, Box, CommonState, and RecordLayer. These types combined reader and writer state and could not be used; I chose to duplicate their code rather than to break the existing codebase. The only changes to them are to make some items pub(crate)` so that I can access them.

This has resulted in a 1500-line module that combines all the important post-handshake TLS logic. It's not pretty, but it does have some benefits; every step in the process is visible at a glance and it's easier to simplify control flow. IMO, in-place encryption/decryption would be much easier to implement on this, than on the existing code; #2758 would not have been hindered the way it was.

It should be possible to refactor this module and split up its data into smaller types again; but I suspect ConnectionCommon etc. are not the ideal layout. But I don't want to put too much more effort into this right now; it's good enough to judge the API concretely and to benchmark. I don't want to waste my time if you (the maintainers) think a different approach is necessary.

Testing

Some basic initial testing shows that this code works! I was able to modify simpleclient and I can happily talk to rust-lang.org. I'll try to find tests that focus on post-handshake state, and perhaps look at BoGo.

This code needs to be integrated with rustls' benchmarking suite. I need to investigate that and figure out how to measure the maximum throughput of a request-response connection, with and without the reader-writer split. Suggestions for alternative benchmarking approaches are welcome.

Future Direction

If this approach seems feasible, I would like to make this implementation unbuffered. Now that all of the different layers of TLS logic are in one place, it's much easier to implement in-place encryption/decryption and wipe out all allocations from the hot path.

The code is only applicable to connections that have finished their handshakes. The handshaking process is significantly more complicated and would be incredibly difficult to implement like this. The core idea behind this PR is that readers and writers don't communicate that much; but that's not true during a handshake. I don't know if or how ClientConnection etc. can be implemented in terms of this code. Still, this module is already useful for users of long-lived connections; they can trade extra handshaking work for better post-handshake performance.

The code currently lacks support for TLS 1.3 session tickets. This made my life way easier, as I could then define Reader and Writer as simple concrete types. Session ticket support will require distinguishing client and server types, as well as TLS 1.2 and TLS 1.3 use. It doesn't sound fun, but it will be necessary.

Commit History / Review

I've left the commit history as-is for now. I'm happy to try to clean it up if merging this PR is on the table. While the overall diff is +1508 -40, the sum of diffs across commits is +2712 -1234; a lot of refactoring happened in-place. It may be possible to review the commits in order, but I'm sure that won't be fun. I've tried to mark down where functions come from (e.g. ConnectionCore vs CommonState), but it might not always be clear. A lot of QUIC and handshake related logic have also been removed.

Sorry about the huge chunk of code to review. I know y'all have limited time. I would suggest prioritizing the API and judging whether this design is feasible. We can talk further once I have some benchmarks.

This provides a simple, buffered API that uses locking internally. The goal is to incrementally strip away layers from the 'ClientConnection', until the lock can be removed.

This introduces the reader-to-writer communication we expect, for passing forward 'sendable_plaintext'. At the moment, I haven't implemented any defensive measures to ensure 'WriterActions' are used. NOTE: The code currently contains a logic error, similar to <rustls#2758>. If the traffic keys are refreshed, the key update message is accidentally sent with the new keys. This will get resolved with enough destructuring.

Turns out it's not used in post-handshake 'State's.

This is a specialization of 'State::handle()' for 'conn::split'.

The common case (application data) is now handled outside the trait object. 'Context::receive_payload()' is also inlined, making it easier to eliminate it.

Now that the 'SideData' type isn't necessary, all the existing code is adapted to work for both client and server sides. This also required splitting out the new 'State' methods into a separate trait.

'ConnectionInfo' holds the immutable parts of 'CommonState'.

In particular: - 'has_seen_eof' - 'has_received_close_notify' - 'received_plaintext'

djc · 2025-12-08T11:52:16Z

Thanks for working on this. I haven't looked at it in much detail, but here are some initial responses:

I think we should support this use case with minimal overhead in a future release.
I don't think we want to add ~1500 LOC just to support this case.
This is a bit of an awkward moment as we're deep into extracting a smaller core from the library that would potentially make this kind of thing easier, but I don't think it's the right time yet to be thinking about this aspect.

We've considered the separate unbuffered API to be a mistake, since it adds a whole bunch of parallel API which makes it difficult/tedious to make sure it has the same level of quality as the "main" API. As such, we're working on refactoring that will make the core library more modular, which I think will make it easier to slot in a reader/writer split.

ctz · 2025-12-09T09:00:42Z

Thanks for working on this.

Yes, thanks for sketching this out. I think the end result is substantially OK. I think the WriterAction is alright for low-assurance items, or places where the protocol allows some delay (like KeyUpdate). If we need anything with higher assurance there are some other options (eg, shared mutable state hidden within the objects, given care that there is no common-case contention). For alerts I believe just a single atomic would be sufficient and allow us to meet the requirements of 6.2

bal-e · 2025-12-09T10:48:31Z

Thanks for having a look! I'm glad to hear that such a use case is considered important.

I don't think we want to add ~1500 LOC just to support this case.

I agree that the module can't be merged in anything close to its current state. But a nice way forward could be to implement Connection in terms of this module. That could be taken further, by removing all buffering from this module and also using it to implement UnbufferedConnection.

This is a bit of an awkward moment as we're deep into extracting a smaller core from the library that would potentially make this kind of thing easier, but I don't think it's the right time yet to be thinking about this aspect.

Ah, that's unfortunate. I think the ideal way to merge this API is to use it as the basis for the existing Connection / UnbufferedConnection APIs, but that would clearly require larger refactoring. I'm happy to try tackling that -- working on this has been fun, and I think I have an idea for how to approach it. However:

I'm new and you have good reason not to rely on me for it.
I don't know the progress of your refactoring work, and whether you want to this API before it.
Redundant work is a waste of time for everyone.

I can see three paths forward:

We can postpone this work and you can continue the core refactoring. Hopefully this PR helps guide the design for the new core, so it is amenable to this addition in the future.
I can clean up this PR and try to implement Connection in terms of it. AFAICT that would be the fastest acceptable way to get this API merged. But Connection has to deal with handshakes, and I'm worried that would cause difficulties.
Depending on the progress of your refactoring, I can try to help out there, and make sure that this use case is nicely satisfied. I'm also happy to generally help with optimizations there, particularly towards trimming allocations and trait objects.

What would you like (me) to do?

djc · 2025-12-09T13:37:08Z

I can see three paths forward:

We can postpone this work and you can continue the core refactoring. Hopefully this PR helps guide the design for the new core, so it is amenable to this addition in the future.

I can clean up this PR and try to implement Connection in terms of it. AFAICT that would be the fastest acceptable way to get this API merged. But Connection has to deal with handshakes, and I'm worried that would cause difficulties.

Depending on the progress of your refactoring, I can try to help out there, and make sure that this use case is nicely satisfied. I'm also happy to generally help with optimizations there, particularly towards trimming allocations and trait objects.

What would you like (me) to do?

I think either 1 or 3 could make sense depending on how much context transfer we think is needed to have you help out. @ctz I think you're currently working on this direction AIUI, do you think there's a meaningful chunk of work that @bal-e could shave off?

arya dradjica added 30 commits December 5, 2025 11:17

Add module 'rustls::conn::split'

eeaecd7

[conn/split] Add a simple locking implementation

dd3497a

This provides a simple, buffered API that uses locking internally. The goal is to incrementally strip away layers from the 'ClientConnection', until the lock can be removed.

[conn/split] Destructure 'ClientConnection'

6db652f

[conn/split] Destructure 'ConnectionCore'

b0e04db

[conn/split] Minor cleanup

6624df9

[conn/split] Ignore 'sendable_plaintext' in 'Reader'

7b54bd7

Turns out it's not used in post-handshake 'State's.

[conn/split] Disallow QUIC connections

b493a0d

[common_state] Add 'State::handle_for_split_traffic()'

4a31e04

This is a specialization of 'State::handle()' for 'conn::split'.

[conn/split] Duplicate 'CommonState::process_main_protocol()'

ae4cac3

[conn/split] Destructure 'State::handle()'

5056e56

The common case (application data) is now handled outside the trait object. 'Context::receive_payload()' is also inlined, making it easier to eliminate it.

[conn/split] Destructure 'Context'

34dc0d9

[conn/split] Support both client and server sides

48705d3

Now that the 'SideData' type isn't necessary, all the existing code is adapted to work for both client and server sides. This also required splitting out the new 'State' methods into a separate trait.

[conn/split] Push fatal alerts to the writer

2e7cd54

[conn/split] Guard state functions

6b52407

[conn/split] Duplicate 'CommonState::process_alert()'

b0b623f

[conn/split] Introduce 'ConnectionInfo'

497ed0f

'ConnectionInfo' holds the immutable parts of 'CommonState'.

[conn/split] Push warning alerts to the writer

14514f8

[conn/split] Return 'received_plaintext' directly

785478f

[conn/split] Elide 'PlaintextSink'

201ed74

[conn/split] Push key updates to the writer

5d06047

[conn/split] Split basic reader state from 'CommonState'

6fc19bf

In particular: - 'has_seen_eof' - 'has_received_close_notify' - 'received_plaintext'

[conn/split] Remove support for session tickets

f8048b0

[conn/split] Split 'CommonState::temper_counters'

7a47ce4

[conn/split] Make 'ConnectionInfo' public

7290ff7

[conn/split] Push close-notify alerts to the writer

9a8a10a

[conn/split] Destructure 'TrafficState'

a73f912

[conn/split] Inline all non-encryption 'CommonState' methods

b7ecdcf

[conn/split] Inline 'CommonState' encryption methods

a22332b

[conn/split] Destructure 'CommonState'

7227fcd

arya dradjica added 4 commits December 5, 2025 11:17

[conn/split] Make 'Received' a '#[must_use]' enum

6cfe79a

[conn/split] Destructure 'RecordLayer'

e2b90fb

[conn/split] Eliminate 'queued_keys_update_message'

0fd8508

[conn/split] Inline 'buffer_plaintext()'

6ae50a7

bal-e mentioned this pull request Dec 5, 2025

Consider a reader-writer split for the unbuffered API #2761

Open

1 task

ctz mentioned this pull request Dec 15, 2025

Tidy record layer #2807

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Proof of concept of a reader-writer split (#2761)#2795

Proof of concept of a reader-writer split (#2761)#2795
bal-e wants to merge 34 commits intorustls:mainfrom
bal-e:split-reader-writer-api

bal-e commented Dec 5, 2025

Uh oh!

djc commented Dec 8, 2025

Uh oh!

ctz commented Dec 9, 2025

Uh oh!

bal-e commented Dec 9, 2025

Uh oh!

djc commented Dec 9, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier! Saves Data!

Uh oh!

Conversation

bal-e commented Dec 5, 2025

The API

Justification

Defensive Measures

Implementation

Testing

Future Direction

Commit History / Review

Uh oh!

djc commented Dec 8, 2025

Uh oh!

ctz commented Dec 9, 2025

Uh oh!

bal-e commented Dec 9, 2025

Uh oh!

djc commented Dec 9, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pFad - (p)hone/(F)rame/(a)nonymizer/(d)eclutterfier! Saves Data!

djc commented Dec 9, 2025 •

edited

Loading