Description
Hi,
First of all, thanks for this excellent library!
Intro
This is a heads-up about an issue that may interest anyone using the projections part of Boost.Geometry.
It is limited to the EPSG projection engine.
To be clear: this is not about the correctness of the code, but about the practical implications of excessive compile times.
The function in question is boost::geometry::projections::detail::epsg_to_parameters(), which converts an EPSG code into the parameters that describe the transformation.
That is, given an EPSG code (an int), it returns a boost::geometry::srs::parameters<>.
This is implemented with a static array inside that function, which is declarative and easy to read.
The issue
The only problem is the time the compiler needs to resolve the expressions in the static table.
Each row is a chained call,
starting with parameters<>(projection)(param1)...(paramN) // the call operator is overloaded
Compile times with GCC versions before 12.1 were already high (several minutes),
but with the new GCC 12.1 they exceed 30 minutes (33 minutes on a Ryzen 9 5950X, with only one core used).
On godbolt, the minimal sample below compiles with pre-12.1 compilers, but times out on 12.1.
#include <cstddef>
#include <boost/geometry/srs/projections/dpar.hpp>
#include <boost/geometry/srs/projections/epsg.hpp>

namespace test {
    // Calling epsg_to_parameters() makes the compiler process the whole static EPSG table.
    std::size_t spend_massive_compile_time(int epsg_code) {
        auto dp = boost::geometry::projections::detail::epsg_to_parameters(epsg_code);
        return dp.size();
    }
}

int main(int argc, char **argv) {
    return static_cast<int>(test::spend_massive_compile_time(32631)); // should return 6
}
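For readers unfamiliar with the chained-call style used in the table, here is a minimal standalone sketch of the pattern. The types `chain_demo::parameters`, `parameter` and the enum are hypothetical stand-ins written for illustration, not the Boost.Geometry classes; the point is only that every `(...)` in a row is a separate overloaded call the compiler has to resolve.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the Boost.Geometry types.
namespace chain_demo {

enum name_id { proj, lat_0, lon_0, k, x_0, y_0 };

struct parameter {
    name_id id;
    double value;
};

class parameters {
public:
    // Each call appends one parameter and returns *this,
    // which is what allows the calls to be chained.
    parameters& operator()(name_id id, double value = 0.0) {
        items_.push_back(parameter{id, value});
        return *this;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<parameter> items_;
};

// One table row written in the chained style: six calls for six parameters.
parameters make_row() {
    return parameters()(proj)(lat_0, 0)(lon_0, -62)(k, 0.9995)(x_0, 400000)(y_0, 0);
}

} // namespace chain_demo
```

With roughly 5k rows of this shape, the compiler has to process tens of thousands of such calls inside a single static initializer.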
Possible solutions/workarounds.
Simplify the expressions in the table
I tried reworking the static table to avoid the chained expressions
and use a plain vector<parameter> {} instead.
The result was promising: about 10 times faster,
so 3 minutes instead of 30 minutes.
Looking at the data that goes into the static table, it is basically an EPSG code plus strongly typed 'numbers'.
If the data that drives the resulting parameters is stored in simple types that are fast to compile,
it might be possible to get down to near-zero compile time,
at the cost of some run time (because the requested object must then be constructed at run time).
E.g. the parameters would be created only when actually asked for,
from typed constants/tuples that are cheap at compile time.
A local test with the initial approach, using the same table definitions,
reduces the compile time from 33 minutes to 3 minutes (promising).
So it shows that the idea works, even though 3 minutes to compile a table of approx. 5k rows is still excessive.
Further progress shows that a simple POD approach
compiles in only 0.8 seconds (compared to 33 minutes with gcc 12.1, or 3 minutes pre gcc 12).
That is approx. 2000 times faster than the current situation for gcc 12.1 (and, my guess, for any Boost.Geometry version).
I think that with a little more work this approach might be worth considering,
unless there are other, more obvious solutions that I have overlooked.
A sketch of this approach looks like:
#include <boost/geometry/srs/projections/dpar.hpp>

namespace boost::geometry::projections::detail {
namespace epsg_data {

/**
 * @brief Simplest possible POD to capture one parameter definition.
 */
struct epsg_p_def {
    int e_type;      ///< unique enum type, one of the name_xxx enums
    int e_val{0};    ///< sub-enum type, one of the value_xxx enums
    int n_args{0};   ///< number of floating-point args, depends on e_type and e_val
    double args[7];  ///< most parameters take 1 arg; towgs84 takes 3 or 7
};

/**
 * @brief POD holding one EPSG definition, i.e. a set of epsg_p_defs.
 */
struct epsg_def {
    epsg_p_def p_defs[13]; // currently an entry needs at most 13 p_defs(!)
};

struct epsg_entry {
    int code;
    epsg_def parameters;
};

using namespace srs::dpar;

static const epsg_entry arr[] = {
{2000, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{units,units_m},{no_defs}}}},
{2001, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{units,units_m},{no_defs}}}},
{2002, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{towgs84,0,7,{725,685,536,0,0,0,0}},{units,units_m},{no_defs}}}},
{2003, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{towgs84,0,7,{72,213.7,93,0,0,0,0}},{units,units_m},{no_defs}}}},
{2004, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{towgs84,0,7,{174,359,365,0,0,0,0}},{units,units_m},{no_defs}}}},
{2005, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{units,units_m},{no_defs}}}},
// several thousand more rows follow here
};

} // namespace epsg_data
} // namespace boost::geometry::projections::detail
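The run-time half of this idea, building the parameters only when a code is actually requested, could look roughly like the standalone sketch below. Everything in the `epsg_sketch` namespace is a hypothetical stand-in: the enums only mimic the srs::dpar ones, `parameter` is a placeholder for the real parameter type, and reserving enum value 0 as an "unused slot" marker is an assumption of this sketch, not something the table above guarantees. The two rows are copied from the table above.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

namespace epsg_sketch {

// Hypothetical stand-ins for the srs::dpar name_xxx / value_xxx enums.
// Assumption: 0 is reserved so that unused p_defs slots can be detected.
enum name_id { no_name = 0, proj, lat_0, lon_0, k, x_0, y_0, ellps, towgs84, units, no_defs };
enum value_id { no_value = 0, proj_tmerc, ellps_clrk80, units_m };

struct p_def {
    int e_type{no_name};
    int e_val{no_value};
    int n_args{0};
    double args[7]{};
};

struct entry {
    int code;
    p_def p_defs[13];
};

// Must stay sorted by code, because the lookup below uses binary search.
static const entry table[] = {
    {2000, {{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{units,units_m},{no_defs}}},
    {2002, {{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{towgs84,0,7,{725,685,536,0,0,0,0}},{units,units_m},{no_defs}}},
};

// Placeholder for the real run-time parameter type.
struct parameter {
    int name;
    int value;
    std::vector<double> args;
};

// Expands one table row into run-time parameters; returns an empty
// vector when the code is not in the table.
std::vector<parameter> to_parameters(int code) {
    const entry* last = std::end(table);
    const entry* it = std::lower_bound(std::begin(table), last, code,
        [](const entry& e, int c) { return e.code < c; });

    std::vector<parameter> result;
    if (it == last || it->code != code) {
        return result;
    }
    for (const p_def& d : it->p_defs) {
        if (d.e_type == no_name) {
            break; // first unused slot ends the row
        }
        assert(d.n_args >= 0 && d.n_args <= 7);
        result.push_back(parameter{d.e_type, d.e_val,
                                   std::vector<double>(d.args, d.args + d.n_args)});
    }
    return result;
}

} // namespace epsg_sketch
```

The table itself is only aggregate initialization of plain structs, which is the part that compiles quickly; all vector construction is deferred to the call of `to_parameters()`.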
Make the table/instantiation part of a .cpp file
If the table were compiled as part of the Boost library itself, instead of being header-only for this part, that might work.
It could be combined with the speedup approach, since 30 minutes is still a long time for a single file.
A variant is to control the inline/static part with a compile-time define, so that affected users can choose to compile the table in a separate translation unit (avoiding repeated compilation of the part that takes time).
The clear drawback of a compiled .cpp file is that it breaks header-only, so the 'hybrid' approach with a user-specified define could work, helping the users that are affected.
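A rough sketch of what the 'hybrid' define could look like is given below. The macro names `EPSG_SKETCH_SEPARATE_TU` and `EPSG_SKETCH_IMPLEMENTATION`, the namespace and the tiny two-row table are all hypothetical and only illustrate the mechanism; nothing like this exists in Boost.Geometry as far as I know. The counts in the stand-in table are the 9 parameters of the 2000 row above and the 6 that the minimal sample expects for 32631.

```cpp
#include <cstddef>

// Hypothetical macros, for illustration only:
//   EPSG_SKETCH_SEPARATE_TU     the user opts out of header-only for the table
//   EPSG_SKETCH_IMPLEMENTATION  defined in exactly one .cpp before including this header

#if defined(EPSG_SKETCH_SEPARATE_TU) && !defined(EPSG_SKETCH_IMPLEMENTATION)

// Opt-in mode, ordinary translation units: declaration only,
// so the expensive table is not compiled here.
namespace epsg_hybrid {
std::size_t parameter_count(int code);
}

#else

// Default header-only mode (inline), or the single implementation unit (non-inline).
#if defined(EPSG_SKETCH_SEPARATE_TU)
#define EPSG_SKETCH_INLINE
#else
#define EPSG_SKETCH_INLINE inline
#endif

namespace epsg_hybrid {

struct row {
    int code;
    std::size_t n_params;
};

EPSG_SKETCH_INLINE std::size_t parameter_count(int code) {
    // Tiny stand-in for the real table of several thousand rows.
    static const row rows[] = { {2000, 9}, {32631, 6} };
    for (const row& r : rows) {
        if (r.code == code) {
            return r.n_params;
        }
    }
    return 0; // unknown code
}

} // namespace epsg_hybrid

#endif
```

With no macro defined, this behaves exactly like a header-only library. A user who is hit by the compile times would define `EPSG_SKETCH_SEPARATE_TU` project-wide and add one small .cpp that also defines `EPSG_SKETCH_IMPLEMENTATION` before the include, so the table is compiled once.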