bg::projections::detail::epsg_to_parameters causes excessive compile times #1006

Closed
@sigbjorn

Hi,

First of all, thanks for this excellent library!

Intro

This is a heads-up about an issue that may interest those using the projections part of Boost.Geometry.
It is limited to the EPSG projection engine.

To be clear: this is not an issue about the correctness of the code, but about the practical implications of excessive compile times.

The function in question is boost::geometry::projections::detail::epsg_to_parameters(), which converts an EPSG code into the parameters that describe the transformation.

I.e. given an EPSG code (an int), it returns boost::geometry::srs::dpar::parameters<>.

This is realized using a static array inside the mentioned function, declarative and clear to read.

The issue

The only issue is the time the compiler needs to resolve the expressions in the static table.
Each row is a chained call,
starting with parameters<>(projection)(param1)....(paramN) // the call operator is overloaded
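
To make the chained-call style concrete, here is a minimal, self-contained sketch of the pattern. The names `name_id`, `param`, `chained_params` and `make_row` are hypothetical stand-ins invented for this illustration; this is not the Boost.Geometry implementation, only the general shape of a type whose call operator returns itself so that calls can be chained.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; NOT the Boost.Geometry types.
enum name_id { lat_0, lon_0, k_0 };

struct param {
    name_id name;
    double value;
};

class chained_params {
public:
    // Each call appends one parameter and returns *this, so calls can be chained.
    chained_params& operator()(name_id name, double value) {
        items_.push_back(param{name, value});
        return *this;
    }
    std::size_t size() const { return items_.size(); }
    const param& operator[](std::size_t i) const { return items_[i]; }

private:
    std::vector<param> items_;
};

// One table row written in the chained style: every (...) is a separate
// function call on the same temporary object.
inline chained_params make_row() {
    return chained_params()(lat_0, 0.0)(lon_0, -62.0)(k_0, 0.9995);
}
```

A table written this way contains one such chain per row, so with several thousand rows the compiler has to process a correspondingly large number of call expressions.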

Compile times with compilers before gcc 12.1 were already high (several minutes),
but with the new gcc 12.1 they exceed 30 minutes (33 minutes on a Ryzen 9 5950X, with only one core used).

On godbolt, the minimal sample below works with pre-12.1 compilers, but times out on 12.1.

#include <cstddef>

#include <boost/geometry/srs/projections/dpar.hpp>
#include <boost/geometry/srs/projections/epsg.hpp>

namespace test {
    // Any use of epsg_to_parameters() makes the compiler process the static EPSG table inside it.
    std::size_t spend_massive_compile_time(int epsg_code) {
        auto dp = boost::geometry::projections::detail::epsg_to_parameters(epsg_code);
        return dp.size();
    }
}

int main() {
    return static_cast<int>(test::spend_massive_compile_time(32631)); // should return 6
}

Possible solutions/workarounds.

Simplify the expressions in the table

I tried to rework the static table to avoid the chained expressions,
and to use a plain vector<parameter> {} instead.

It was kind of promising, 10 times faster!

  • so 3 minutes instead of 30 minutes.

Looking at the data that goes into the static table, it is basically codes and numbers, strongly typed.
So by moving in a direction where the data that drives the resulting parameters consists of simple types that are fast to compile,
it might be possible to get down to 'zero' compile time,

  • at the cost of some run time (because the wanted object must then be constructed at run time).

E.g. creating the parameters only when they are really asked for,
by means of typed constants/tuples that are cheap at compile time.

A local test with the initial approach, using the same table definitions,
reduces the compile time from 33 minutes to 3 minutes (promising).
So it shows that the approach works, even though 3 minutes to compile a table of approx. 5k rows is still excessive.

Further progress on this shows that compiling a simple POD approach
takes only 0.8 seconds (compared to 33 minutes with gcc 12.1, or 3 minutes with gcc before version 12).

So more than 2000 times faster than the current situation with gcc 12.1 (and with any Boost.Geometry version, is my guess).

I think with a little work on this approach it might be worth considering,
unless there are other more obvious solutions around that I have overlooked.

A sketch of this approach looks like:

#include <boost/geometry/srs/projections/dpar.hpp>


namespace boost::geometry::projections::detail {
    namespace epsg_data {
        /**
         * @brief simplest possible POD to capture a parameter definition
         */
        struct epsg_p_def {
            int e_type;     ///< unique enum type, one of the name_xxx enums
            int e_val{0};   ///< sub-enum type, one of the value_xxx enums
            int n_args{0};  ///< number of floating point args, dependent on e_type, e_val
            double args[7]; ///< most parameters have 1 arg, towgs84 has 3 or 7
        };
        /**
         * @brief POD to keep one EPSG definition, a set of epsg_p_defs
         */
        struct epsg_def {
            epsg_p_def p_defs[13]; // currently the number of p_defs is 13(!)
        };
        struct epsg_entry {
            int code;
            epsg_def parameters;
        };
        using namespace srs::dpar;

        static const epsg_entry arr[] = {
            {2000, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{units,units_m},{no_defs}}}},
            {2001, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{units,units_m},{no_defs}}}},
            {2002, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{towgs84,0,7,{725,685,536,0,0,0,0}},{units,units_m},{no_defs}}}},
            {2003, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{towgs84,0,7,{72,213.7,93,0,0,0,0}},{units,units_m},{no_defs}}}},
            {2004, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{towgs84,0,7,{174,359,365,0,0,0,0}},{units,units_m},{no_defs}}}},
            {2005, {{{proj,proj_tmerc},{lat_0,0,1,{0}},{lon_0,0,1,{-62}},{k,0,1,{0.9995000000000001}},{x_0,0,1,{400000}},{y_0,0,1,{0}},{ellps,ellps_clrk80},{units,units_m},{no_defs}}}},
          // several thousand more rows follow here
        };
    }
}
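
The sketch above stops at the table itself. As a rough illustration of the run-time side (building the parameters only when a code is actually requested), the following self-contained example uses simplified, hypothetical stand-ins: the namespace `epsg_lookup_sketch`, the structs `p_def` and `entry`, the extra `n_defs` field, the plain integer ids, and `runtime_param` are all assumptions made for this example and are not Boost.Geometry types. It assumes the table is sorted by code so that a binary search can be used.

```cpp
#include <algorithm>
#include <iterator>
#include <utility>
#include <vector>

namespace epsg_lookup_sketch {  // hypothetical namespace, not part of Boost.Geometry

// Simplified stand-ins for the PODs in the sketch above.
struct p_def {
    int e_type;      // parameter name id
    int n_args;      // number of used entries in args
    double args[7];
};

struct entry {
    int code;
    int n_defs;      // number of used entries in p_defs (assumed extra field)
    p_def p_defs[3];
};

// Plain integer ids instead of the srs::dpar enums, to keep the example standalone.
enum { id_lat_0 = 1, id_lon_0 = 2, id_k = 3 };

// Must stay sorted by code for the binary search below.
static const entry table[] = {
    {2000, 3, {{id_lat_0, 1, {0}}, {id_lon_0, 1, {-62}}, {id_k, 1, {0.9995}}}},
    {2001, 2, {{id_lat_0, 1, {0}}, {id_lon_0, 1, {-62}}}},
};

// Run-time representation, built only when a code is requested.
using runtime_param = std::pair<int, std::vector<double>>;

inline std::vector<runtime_param> to_parameters(int code) {
    const entry* first = std::begin(table);
    const entry* last = std::end(table);
    const entry* it = std::lower_bound(
        first, last, code,
        [](const entry& e, int c) { return e.code < c; });

    std::vector<runtime_param> result;
    if (it == last || it->code != code) {
        return result;  // unknown code: empty result
    }
    for (int i = 0; i < it->n_defs; ++i) {
        const p_def& d = it->p_defs[i];
        result.emplace_back(d.e_type,
                            std::vector<double>(d.args, d.args + d.n_args));
    }
    return result;
}

}  // namespace epsg_lookup_sketch
```

The table is pure aggregate data, so the compiler only has to lay out constants; the allocation and construction cost moves to the call of `to_parameters()`, which matches the trade-off described above (less compile time, some run time).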

Make the table/instantiation part of a .cpp file

If the table were compiled as part of the Boost library itself, instead of being header-only for this part, that might work.
Preferably in combination with the speed-up approach above, since 30 minutes is still a long time for a single file.

A variant is to control the inline/static part with a compile-time define, so that affected users can choose to compile the table part in a separate translation unit (avoiding repeated compilation of the part that takes time).
The drawback of this approach is clearly that it breaks header-only, so the 'hybrid' approach, enabled through a user-specified define, could work and help the users that are affected.
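
One possible shape of such a 'hybrid' header is sketched below. The macro names `EPSG_SKETCH_SEPARATE_TABLE` and `EPSG_SKETCH_TABLE_IMPLEMENTATION`, the namespace `epsg_hybrid_sketch`, and the function `table_size()` are hypothetical and chosen only for this illustration; no such option exists in Boost.Geometry as far as this issue is concerned. The tiny array stands in for the large table.

```cpp
// Hypothetical header sketch; include guard name is also an assumption.
#ifndef EPSG_HYBRID_SKETCH_HPP
#define EPSG_HYBRID_SKETCH_HPP

#include <cstddef>

// Neither macro defined: header-only behaviour, the table is compiled in
// every translation unit that uses it.
// EPSG_SKETCH_SEPARATE_TABLE defined: the header only declares the function.
// Exactly one .cpp file additionally defines EPSG_SKETCH_TABLE_IMPLEMENTATION
// before including this header, so the table is compiled once.
#if defined(EPSG_SKETCH_SEPARATE_TABLE) && !defined(EPSG_SKETCH_TABLE_IMPLEMENTATION)

namespace epsg_hybrid_sketch {
std::size_t table_size();  // defined in the single implementation .cpp
}

#else

#ifdef EPSG_SKETCH_SEPARATE_TABLE
#define EPSG_SKETCH_INLINE          // out-of-line definition in one .cpp
#else
#define EPSG_SKETCH_INLINE inline   // header-only default
#endif

namespace epsg_hybrid_sketch {
EPSG_SKETCH_INLINE std::size_t table_size() {
    static const int codes[] = {2000, 2001, 2002};  // stands in for the large table
    return sizeof(codes) / sizeof(codes[0]);
}
}

#endif

#endif  // EPSG_HYBRID_SKETCH_HPP
```

With nothing defined, existing header-only users see no change. An affected user would build with `EPSG_SKETCH_SEPARATE_TABLE` defined for the whole project and add one source file that defines `EPSG_SKETCH_TABLE_IMPLEMENTATION` before including the header, so the expensive part is compiled in a single translation unit.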
