-
Notifications
You must be signed in to change notification settings - Fork 205
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Speed up Relate for PreparedGeometry #1317
base: main
Are you sure you want to change the base?
Conversation
== Bench $ cargo bench --bench prepared_geometry -- --baseline=main Compiling geo v0.29.3 (/Users/mkirk/src/georust/geo/geo) Compiling jts-test-runner v0.1.0 (/Users/mkirk/src/georust/geo/jts-test-runner) Finished `bench` profile [optimized] target(s) in 4.65s Running benches/prepared_geometry.rs (target/release/deps/prepared_geometry-b11c776cecdacd91) Gnuplot not found, using plotters backend relate prepared polygons time: [32.210 ms 32.290 ms 32.383 ms] change: [-33.711% -33.548% -33.350%] (p = 0.00 < 0.05) Performance has improved. Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe Benchmarking relate unprepared polygons: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 80.0s, or reduce sample count to 10. relate unprepared polygons time: [799.79 ms 801.23 ms 802.67 ms] change: [-1.5726% -1.2283% -0.9119%] (p = 0.00 < 0.05) Change within noise threshold. Found 2 outliers among 100 measurements (2.00%) 2 (2.00%) high mild
…ameters Is this what progress looks like? 😭
== Bench NOTE: this is vs. *main*, so the prepared_geometry bench was already -33.548% faster vs main. $ cargo bench --bench prepared_geometry -- --baseline=main Compiling geo v0.29.3 (/Users/mkirk/src/georust/geo/geo) Compiling jts-test-runner v0.1.0 (/Users/mkirk/src/georust/geo/jts-test-runner) Finished `bench` profile [optimized] target(s) in 6.35s Running benches/prepared_geometry.rs (target/release/deps/prepared_geometry-b11c776cecdacd91) Gnuplot not found, using plotters backend relate prepared polygons time: [25.355 ms 25.456 ms 25.562 ms] change: [-47.861% -47.612% -47.365%] (p = 0.00 < 0.05) Performance has improved. Found 3 outliers among 100 measurements (3.00%) 3 (3.00%) high mild Benchmarking relate unprepared polygons: Warming up for 3.0000 s Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 82.0s, or reduce sample count to 10. relate unprepared polygons time: [761.98 ms 764.63 ms 767.56 ms] change: [-6.1618% -5.7402% -5.3188%] (p = 0.00 < 0.05) Performance has improved. Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe
where | ||
Self: Sized, | ||
{ | ||
RelateOperation::new(self, other).compute_intersection_matrix() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the crux of the change - rather than passing in the geometry_graph
, we pass in the thing that builds the geometry_graph (which could be a Geometry
or a PreparedGeometry
).
e1c86d5
to
509ea49
Compare
@gauteh you might be interested in trying this out. |
use crate::Intersects; | ||
match ( | ||
self.graph_a.geometry().bounding_rect(), | ||
self.graph_b.geometry().bounding_rect(), | ||
self.geometry_a.bounding_rect().into(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously: self.graph_a.geometry()
is the (unprepared) GeometryCow
, so we'd do a bbox calculation here.
Now self.geometry_a
can be a PreparedGeometry
, which has a cached bounding_rect. Or if you call Relate
with an unprepared geometry, it will continue to do the old dynamic bbox calculation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's an impressive speedup, a clear win! A few questions about the code, likely just because this is an unfamiliar codebase
@@ -108,11 +108,11 @@ impl<F: GeoFloat> LabeledEdgeEndBundleStar<F> { | |||
let left_position = label.position(geom_index, Direction::Left); | |||
let right_position = label.position(geom_index, Direction::Right); | |||
|
|||
if let Some(right_position) = right_position { | |||
if let Some(_right_position) = right_position { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's going on here? I thought leading underscore was convention for "allow this unused variable", but it gets used below. Is this just because the debug assertion gets compiled out sometimes and you get an error otherwise?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right. Specifically it's only used in a debug assert, which gets compiled out in release builds, thus producing an "unused" warning.
let r_tree = geometry_graph.build_tree(); | ||
|
||
let envelope = r_tree.root().envelope(); | ||
let bounding_rect: Option<Rect<F>> = if envelope.lower().x > envelope.upper().x { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or lower().y > upper().y
? Why is only one of these checks sufficient? Or is this just looking for the case when envelope()
returns lower == upper?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. This definitely warrants a comment in the code, I'll add something like this:
rstar::AABB
envelope implementation has a lower
point: (min_x, min_y) and an upper
point (max_x, max_y).
But rstar needs to be able to represent an empty envelopes (e.g. the envelope of a POLYGON EMPTY
). Instead of doing something explicit like Option<Envelope>
, the rstar implementation returns an "invalid" envelope where lower
has the max value and upper
has the min values.
fn new_empty() -> Self {
let max = P::Scalar::max_value();
let min = P::Scalar::min_value();
Self {
lower: P::from_value(max),
upper: P::from_value(min),
}
}
This empty envelope represents the only time this invariant is broken, so if one of the dimensions (x
) is invalid, the other is too. It would be sufficient to check either variable. We could check both, but it would be redundant in all cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it worth a fn is_empty()
if many places check for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's the only place we're checking AFAIK. envelope.is_empty()
seems like a nice addition for rstar. I'm not super active in that repo, but I'll add a comment for now, and then propose this change upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I've come up with a clearer way to write this. Take another look @dabreegster eee42d7
I still think the is_empty
might be worth adding, but it seems less urgent.
Updated: opened PR here: georust/rstar#190
/// | ||
/// # Params | ||
/// | ||
/// `idx`: 0 or 1, designating A or B (respectively) in the role this geometry plays |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guessing it's overkill to make this more typesafe with an enum?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically this is a reflection of the reference implementation — a named enum would definitely be less cryptic.
I think we could do it, but it would be kind of a broad change - ultimately it's used throughout the geometry graph code as an index of various [T; 2]
.
Since it doesn't represent new complexity in this PR (I've only added documentation here), I'd prefer not to do that refactor in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, not suggesting a change. Thanks for the explanation
7703ff8
to
eee42d7
Compare
eee42d7
to
5b5bbb2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
CHANGES.md
if knowledge of this change could be valuable to users.Primarily: We cache the
bounding_rect
computation on thePreparedGeometry
, which gets us most of our perf gain.Secondarily: I realized we can skip building the geometry graph if the bbox check fails (which is a modest speedup for both prepared and unprepared geometries).