Skip to content

Commit 27c8e5e

Browse files
committed
Include struct layout in callbacks.
This PR reports the memory layout information (size, alignment, packed-ness) for each struct into the ParseCallbacks. The use-case for this is fairly complex. Postprocessors such as autocxx currently do not necessarily pass all bindgen's output through: autocxx is opinionated in deciding what C++ types are "reasonable" to expose to Rust. For example types containing C++ strings aren't represented to Rust by-value. Instead, these types are represented as "opaque" types in the cxx sense (https://cxx.rs/extern-c++.html#opaque-c-types). As far as Rust is concerned, these types have no fields and can only be poked at using methods. In order to make these allocable on the stack, autocxx _does_ need to know the layout of these structs, so it can make an "opaque" fake struct in lieu of the real one. That's what is enabled by this callback. autocxx and other tools can post-process the bindgen output to replace bindgen's generated type with an opaque data-only struct like this: #[repr(align(16),packed)] pub struct TheType { _data: UnsafeCell<MaybeUninit<[u8;4]>> // etc. } // the actual struct is quite a bit more complex The extra information provided in the callback within this PR allows that. For completeness, there are multiple ways that the whole stack could be rearchitected to avoid the need for this. 1. We could teach bindgen all the rules for when a struct should be "opaque", then bindgen could generate it that way in the first place. This suggests major extensibility of bindgen to allow highly opinionated backends, along the lines of #2943. 2. That decision could be delegated to the ParseCallbacks with a new fn make_struct_opaque(&self, some_struct_info) -> bool call. Although this sounds simple, the decisions required here are very complex and depend (for instance) upon all the different constructors, ability to represent template params, etc., so in practice this would require post-processors to run bindgen twice, once to gather all that information and then again to give truthful answers to that callback. 3. Post-processors such as autocxx could instead generate opaque types like this: #[repr(transparent)] pub struct TheType { _hidden_field: UnsafeCell<MaybeUninit<TheTypeAsGeneratedByBindgen>> } // it's more complex in reality or similar. This turns out to be very complex because TheTypeAsGeneratedByBindgen might include complex STL structures such as std::string which autocxx currently instructs bindgen to replace with a simpler type, precisely so it can avoid all this complexity. Any of these options are months of work either in bindgen or autocxx or both, so I'm hoping that it's OK to simply emit the layout information which bindgen already knows. Compatibility: this is a further compatibility break to the DiscoveredItem enum. However, it seems to be accepted that we're going to grow that enum over time. If the compatibility break is a concern this information could be reported via a separate callback. Part of google/autocxx#124
1 parent 0df4256 commit 27c8e5e

File tree

4 files changed

+38
-8
lines changed

4 files changed

+38
-8
lines changed

bindgen-tests/tests/parse_callbacks/item_discovery_callback/mod.rs

+27-3
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,9 @@ use std::rc::Rc;
44

55
use regex::Regex;
66

7-
use bindgen::callbacks::{DiscoveredItem, DiscoveredItemId, ParseCallbacks};
7+
use bindgen::callbacks::{
8+
DiscoveredItem, DiscoveredItemId, Layout, ParseCallbacks,
9+
};
810
use bindgen::Builder;
911

1012
#[derive(Debug, Default)]
@@ -37,6 +39,11 @@ pub fn test_item_discovery_callback() {
3739
DiscoveredItem::Struct {
3840
original_name: Some("NamedStruct".to_string()),
3941
final_name: "NamedStruct".to_string(),
42+
layout: Some(Layout {
43+
size: 0,
44+
align: 1,
45+
packed: false,
46+
}),
4047
},
4148
),
4249
(
@@ -78,6 +85,11 @@ pub fn test_item_discovery_callback() {
7885
DiscoveredItem::Struct {
7986
original_name: None,
8087
final_name: "_bindgen_ty_*".to_string(),
88+
layout: Some(Layout {
89+
size: 0,
90+
align: 1,
91+
packed: false,
92+
}),
8193
},
8294
),
8395
(
@@ -122,7 +134,7 @@ fn compare_item_info(
122134
) -> bool {
123135
if std::mem::discriminant(expected_item) !=
124136
std::mem::discriminant(generated_item)
125-
{
137+
{
126138
return false;
127139
}
128140

@@ -160,6 +172,7 @@ pub fn compare_struct_info(
160172
let DiscoveredItem::Struct {
161173
original_name: expected_original_name,
162174
final_name: expected_final_name,
175+
layout: expected_layout,
163176
} = expected_item
164177
else {
165178
unreachable!()
@@ -168,6 +181,7 @@ pub fn compare_struct_info(
168181
let DiscoveredItem::Struct {
169182
original_name: generated_original_name,
170183
final_name: generated_final_name,
184+
layout: generated_layout,
171185
} = generated_item
172186
else {
173187
unreachable!()
@@ -177,12 +191,22 @@ pub fn compare_struct_info(
177191
return false;
178192
}
179193

180-
match (expected_original_name, generated_original_name) {
194+
if !match (expected_original_name, generated_original_name) {
181195
(None, None) => true,
182196
(Some(expected_original_name), Some(generated_original_name)) => {
183197
compare_names(expected_original_name, generated_original_name)
184198
}
185199
_ => false,
200+
} {
201+
return false;
202+
}
203+
204+
match (expected_layout, generated_layout) {
205+
(None, None) => true,
206+
(Some(expected_layout), Some(actual_layout)) => {
207+
expected_layout == actual_layout
208+
}
209+
_ => false,
186210
}
187211
}
188212

bindgen/callbacks.rs

+5
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ pub use crate::ir::analysis::DeriveTrait;
44
pub use crate::ir::derive::CanDerive as ImplementsTrait;
55
pub use crate::ir::enum_ty::{EnumVariantCustomBehavior, EnumVariantValue};
66
pub use crate::ir::int::IntKind;
7+
pub use crate::ir::layout::Layout;
78
use std::fmt;
89

910
/// An enum to allow ignoring parsing of macros.
@@ -192,6 +193,10 @@ pub enum DiscoveredItem {
192193

193194
/// The name of the generated binding
194195
final_name: String,
196+
197+
/// The layout of the structure in memory (size, alignment, etc.) if
198+
/// known.
199+
layout: Option<Layout>,
195200
},
196201

197202
/// Represents a union with its original name in C and its generated binding name

bindgen/codegen/mod.rs

+1
Original file line numberDiff line numberDiff line change
@@ -2490,6 +2490,7 @@ impl CodeGenerator for CompInfo {
24902490
.name()
24912491
.map(String::from),
24922492
final_name: canonical_ident.to_string(),
2493+
layout,
24932494
},
24942495
CompKind::Union => DiscoveredItem::Union {
24952496
original_name: item

bindgen/ir/layout.rs

+5-5
Original file line numberDiff line numberDiff line change
@@ -7,14 +7,14 @@ use crate::ir::context::BindgenContext;
77
use std::cmp;
88

99
/// A type that represents the struct layout of a type.
10-
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
11-
pub(crate) struct Layout {
10+
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Ord, PartialOrd)]
11+
pub struct Layout {
1212
/// The size (in bytes) of this layout.
13-
pub(crate) size: usize,
13+
pub size: usize,
1414
/// The alignment (in bytes) of this layout.
15-
pub(crate) align: usize,
15+
pub align: usize,
1616
/// Whether this layout's members are packed or not.
17-
pub(crate) packed: bool,
17+
pub packed: bool,
1818
}
1919

2020
#[test]

0 commit comments

Comments
 (0)