|
| 1 | +- Feature Name: safe-deref |
| 2 | +- Start Date: 2017-01-26 |
| 3 | +- RFC PR: |
| 4 | +- Rust Issue: |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +Add `SafeDeref` and `SafeDerefMut` trait to the standard library, equivalent to `Deref` and |
| 10 | +`DerefMut` but which are guaranteed to always return the same object. |
| 11 | + |
| 12 | +# Motivation |
| 13 | +[motivation]: #motivation |
| 14 | + |
| 15 | +Imagine a C API that looks like this: |
| 16 | + |
| 17 | +```c |
| 18 | +// Creates a connection to a database. |
| 19 | +connection_t* create_connection(); |
| 20 | +// Destroys a connection. You must destroy all queries before destroying a connection. |
| 21 | +void delete_connection(connection_t*); |
| 22 | + |
| 23 | +// Creates a new SQL query using the connection. Returns the ID of the query. |
| 24 | +int create_query(connection_t*, const str*); |
| 25 | +// Destroys a query. The query id must be valid for this connection. |
| 26 | +void delete_query(connection_t*, int); |
| 27 | +// Starts a query. The query id must be valid for this connection. |
| 28 | +void start_query(connection_t*, int); |
| 29 | +// Gets the results of the query. The query id must be valid for this connection. |
| 30 | +some_type get_query_results(connection_t*, int); |
| 31 | +``` |
| 32 | +
|
| 33 | +The usage pretty straight-forward, but take note of the comments and requirements of the API. |
| 34 | +In order to make a query you are supposed to call `create_query`, which will return an ID. In order |
| 35 | +to manipulate the query, you must then pass both the connection and the ID to the API. |
| 36 | +
|
| 37 | +One can wrap around this API like this in Rust (skipping some functions): |
| 38 | +
|
| 39 | +```rust |
| 40 | +pub struct Connection { |
| 41 | + ptr: *mut ffi::connection_t |
| 42 | +} |
| 43 | +
|
| 44 | +impl Drop for Connection { |
| 45 | + unsafe { ffi::delete_connection(self.ptr) } |
| 46 | +} |
| 47 | +
|
| 48 | +pub struct Query<'a> { |
| 49 | + connection: &'a Connection, |
| 50 | + id: i32, |
| 51 | +} |
| 52 | +
|
| 53 | +impl<'a> Query<'a> { |
| 54 | + pub fn start(&self) { |
| 55 | + unsafe { ffi::start_query(self.connection.ptr, self.id) } |
| 56 | + } |
| 57 | +} |
| 58 | +
|
| 59 | +impl<'a> Drop for Query<'a> { |
| 60 | + unsafe { ffi::delete_query(self.connection.ptr, self.id) } |
| 61 | +} |
| 62 | +``` |
| 63 | + |
| 64 | +Everything works well, and everything is safe. |
| 65 | + |
| 66 | +But after a few days someone opens an issue because for example they would like to distribute |
| 67 | +queries amongst threads and can't because of the lifetime. In order to solve the problem, you |
| 68 | +rewrite your code and change the `Query` to be generic: |
| 69 | + |
| 70 | +```rust |
| 71 | +pub struct Connection { |
| 72 | + ptr: *mut ffi::connection_t |
| 73 | +} |
| 74 | + |
| 75 | +impl Drop for Connection { |
| 76 | + unsafe { ffi::delete_connection(self.ptr) } |
| 77 | +} |
| 78 | + |
| 79 | +pub struct Query<P> where P: Deref<Target = Connection> { |
| 80 | + connection: P, |
| 81 | + id: i32, |
| 82 | +} |
| 83 | + |
| 84 | +impl<P> Query<P> where P: Deref<Target = Connection> { |
| 85 | + pub fn start(&self) { |
| 86 | + unsafe { ffi::start_query(self.connection.ptr, self.id) } |
| 87 | + } |
| 88 | +} |
| 89 | + |
| 90 | +impl<P> Drop for Query<P> where P: Deref<Target = Connection> { |
| 91 | + unsafe { ffi::delete_query(self.connection.ptr, self.id) } |
| 92 | +} |
| 93 | +``` |
| 94 | + |
| 95 | +This way the user can either use a `Query<&'a Connection>` or a `Query<Arc<Connection>>`, depending |
| 96 | +on what suits them best. |
| 97 | + |
| 98 | +Everything is fine, right? Wrong! Because objects that implement `Deref`/`DerefMut` are not |
| 99 | +guaranteed to return the same object every time. For example the user can do this: |
| 100 | + |
| 101 | +```rust |
| 102 | +pub struct MyFancyArc { |
| 103 | + a: Arc<Connection>, |
| 104 | + b: Arc<Connection>, |
| 105 | +} |
| 106 | + |
| 107 | +impl Deref for MyFancyArc { |
| 108 | + type Target = Connection; |
| 109 | + |
| 110 | + fn deref(&self) -> &Connection { |
| 111 | + if rand::random::<f32>() < 0.5 { |
| 112 | + &*self.a |
| 113 | + } else { |
| 114 | + &*self.b |
| 115 | + } |
| 116 | + } |
| 117 | +} |
| 118 | +``` |
| 119 | + |
| 120 | +And then use a `Query<MyFancyArc>`. And if they do so then the wrapper becomes unsound, because it |
| 121 | +is possible to call `delete_query` with a wrong connection/query id pair. |
| 122 | + |
| 123 | +## Solving the problem |
| 124 | + |
| 125 | +As a library writer, I can see three ways to solve this problem. |
| 126 | + |
| 127 | +The first way would be to put the `*mut connection_t` inside some sort of internal hidden `Arc` |
| 128 | +shared by the `Connection` and the `Query`. But if you do so you might as well force a |
| 129 | +`Query<Arc<Connection>>` anyway, as the benefits of using a lightweight reference disappear. |
| 130 | + |
| 131 | +The second way would be to store the `*mut connection_t` inside the `Query` struct, and before |
| 132 | +every single FFI call that uses the query we check that the `*mut connection_t` inside the |
| 133 | +`Connection` matches the `*mut connection_t` inside the `Query`. |
| 134 | + |
| 135 | +In other words, one would have to write this: |
| 136 | + |
| 137 | +```rust |
| 138 | +pub struct Query<P> where P: Deref<Target = Connection> { |
| 139 | + connection: P, |
| 140 | + connection_ptr: *mut connection_t, |
| 141 | + id: i32, |
| 142 | +} |
| 143 | + |
| 144 | +impl<P> Query<P> where P: Deref<Target = Connection> { |
| 145 | + pub fn start(&self) { |
| 146 | + let connec = self.connection.deref(); |
| 147 | + assert_eq!(connec.ptr, self.connection_ptr); |
| 148 | + unsafe { ffi::start_query(connec.ptr, self.id) } |
| 149 | + } |
| 150 | +} |
| 151 | +``` |
| 152 | + |
| 153 | +This approach has three major drawbacks: |
| 154 | + |
| 155 | +- You add a runtime overhead at every single function call. The `Query` object, which was supposed |
| 156 | + to be lightweight now performs checks that will be false most of the time anyway. Ideologically |
| 157 | + it is pretty bad to have to add a runtime check for what is a weakness of the safe/unsafe |
| 158 | + mechanism of the Rust language. |
| 159 | +- It is really painful to write and it is too easy to miss a check somewhere. |
| 160 | +- It doesn't prevent the `Connection` from being destroyed while there are queries still alive. |
| 161 | + A rogue implementation of `Deref` can choose to destroy its content at any moment thanks to a |
| 162 | + background thread for example. |
| 163 | + |
| 164 | +The third way to solve the problem, which is proposed in this RFC, is to add new traits named |
| 165 | +`SafeDeref` and `SafeDerefMut` that guarantee that they always return the same object and that |
| 166 | +their content will outlast them. |
| 167 | + |
| 168 | +# Detailed design |
| 169 | +[design]: #detailed-design |
| 170 | + |
| 171 | +Add the following traits to `std::sync`: |
| 172 | + |
| 173 | +```rust |
| 174 | +unsafe trait SafeDeref: Deref {} |
| 175 | +unsafe trait SafeDerefMut: SafeDeref + DerefMut {} |
| 176 | +``` |
| 177 | + |
| 178 | +Types that implement this trait are guaranteed to always return the same object and that their |
| 179 | +content lives at least as long as they do. |
| 180 | +Most of the implementations of `Deref`/`DerefMut` should match these criterias. |
| 181 | + |
| 182 | +Implement these traits on all the types of the standard library that already implement respectively |
| 183 | +`Deref` and `DerefMut`. |
| 184 | + |
| 185 | +# How We Teach This |
| 186 | +[how-we-teach-this]: #how-we-teach-this |
| 187 | + |
| 188 | +This feature is oriented towards people who write unsafe code, and thus doesn't need to appear |
| 189 | +in any beginner-oriented book. |
| 190 | +A warning should be added to the documentation of `Deref` and `DerefMut` and in the rustnomicon to |
| 191 | +make it clear that they shouldn't be relied upon when writing unsafe code. |
| 192 | + |
| 193 | +# Drawbacks |
| 194 | +[drawbacks]: #drawbacks |
| 195 | + |
| 196 | +The major drawback is that library writers that create their own `Deref`-implementing objects are |
| 197 | +possibly going to add an unsafe trait implementation to their code. |
| 198 | + |
| 199 | +In other words a codebase that could be entirely safe could therefore become unsafe. |
| 200 | + |
| 201 | +# Alternatives |
| 202 | +[alternatives]: #alternatives |
| 203 | + |
| 204 | +- One possible alternative is to modify the `Deref` and `DerefMut` traits directly. |
| 205 | +In the author's opinion this would be the best thing to do, however this would require adding |
| 206 | +`unsafe` to these traits and would be a breaking change. |
| 207 | + |
| 208 | +- One could also argue that C APIs that look like the motivating example are simply badly designed, |
| 209 | +and that writing a rogue implementation of `Deref` should always result in either a logic error or |
| 210 | +a panic because in the end we're only manipulating memory after all. |
| 211 | + |
| 212 | +- The prefix `Safe` is maybe not great. |
| 213 | + |
| 214 | +# Unresolved questions |
| 215 | +[unresolved]: #unresolved-questions |
| 216 | + |
| 217 | +None? |
0 commit comments