Conversation

@theroggy theroggy commented Sep 4, 2025

Some GDAL functions (e.g. executing SQL statements) can take a while, so nogil-ling those can already give significant performance improvements when using multithreading.

Remark: determining exactly which GDAL functions take time is not trivial, as this can depend on the file format. For file types with a stricter data schema (e.g. shapefile), determining column types is fast; for some other file types (e.g. GeoJSON), the file needs to be read entirely first to determine all data types. Hence, a relatively large subset of GDAL calls has been no-gilled to avoid missing significant situations.
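The pattern being discussed can be sketched in Cython roughly as follows (this is an illustrative sketch, not the exact pyogrio code; the wrapper function name is hypothetical, while `GDALDatasetExecuteSQL` and its signature come from the GDAL C API):

```cython
# Illustrative Cython sketch: release the GIL around a potentially
# long-running GDAL call so other Python threads can run concurrently.
cdef OGRLayerH execute_sql_nogil(GDALDatasetH ogr_dataset, const char* sql):
    cdef OGRLayerH result
    with nogil:
        # Inside this block no Python objects may be created or touched;
        # only plain C calls like this GDAL function are allowed.
        result = GDALDatasetExecuteSQL(ogr_dataset, sql, NULL, NULL)
    return result
```

While one thread is blocked inside `GDALDatasetExecuteSQL`, other Python threads can acquire the GIL and do useful work, which is where the multithreaded speedup comes from.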

theroggy added 30 commits April 3, 2022 01:59
@theroggy theroggy changed the title ENH: unlock the gil during execute of sql ENH: unlock the gil during GDALDatasetExecuteSQL Sep 5, 2025
@theroggy theroggy modified the milestone: 0.11.0 Sep 5, 2025
@jorisvandenbossche

Looks good to me!

@theroggy theroggy marked this pull request as ready for review September 5, 2025 19:45
@theroggy theroggy added this to the 0.11.0 milestone Sep 5, 2025
@theroggy theroggy changed the title ENH: unlock the gil during GDALDatasetExecuteSQL ENH: unlock the gil during GDAL functions that can take significant time Sep 8, 2025
@theroggy theroggy modified the milestones: 0.11.0, 0.12.0 Sep 14, 2025
Comment on lines +502 to +503
with nogil:
ogr_feature = OGR_L_GetNextFeature(ogr_layer)
Since we are in a loop here for counting the features, this means that for a file with 10,000 rows we are going to release/acquire the GIL 10,000 times?
(If I am not misreading the code, that might not be optimal.)

We could probably move the with nogil up, though, and put the entire loop inside it. We would just need to reacquire the GIL (with gil:) when raising errors.
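The suggestion could look roughly like this in Cython (an illustrative sketch, not the actual pyogrio code; the function name and the overflow check are hypothetical, while `OGR_L_GetNextFeature` and `OGR_F_Destroy` are real GDAL C API calls):

```cython
# Illustrative Cython sketch: keep the GIL released across the whole
# counting loop, reacquiring it only on the error path.
cdef int count_features(OGRLayerH ogr_layer) except -1:
    cdef OGRFeatureH ogr_feature
    cdef int count = 0
    with nogil:
        while True:
            ogr_feature = OGR_L_GetNextFeature(ogr_layer)
            if ogr_feature == NULL:
                break  # no more features
            OGR_F_Destroy(ogr_feature)
            count += 1
            if count < 0:  # hypothetical error check, to show the pattern
                with gil:
                    # The GIL must be held to create/raise a Python exception.
                    raise OverflowError("feature count overflowed")
    return count
```

This releases/acquires the GIL once per call instead of once per feature; the trade-off discussed below is whether that actually matters in practice.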

@theroggy theroggy Oct 31, 2025


@jorisvandenbossche
I reran my performance tests with and without with nogil around functions like OGR_L_GetNextFeature in the high-iteration feature-reading loops, using the New Zealand building outlines file.

Conclusion: it doesn't seem to give a significant difference either way:

  • When reading the entire New Zealand building outlines file (3.3 million features), there is no difference whether the nogil is there or not, even though the block is entered 3.3 million times.
  • Also when running in parallel threads, the differences are within the fluctuations I see when running the script multiple times.

It seems the cost of having it there is not significant, at least in the tests I did. There might be file formats where having the nogil there is beneficial, but that is pure speculation. It is also possible that the nogil behaves a bit differently on a loaded server than on a Windows desktop; also speculation.

Hence, I don't have any preference for keeping it or not.

Remark: moving it up so the entire loop is no-gilled takes some effort, as check_pointer returns a Python object (an exception), which isn't trivial to change... and as there doesn't seem to be a performance impact anyway, I don't think it is worth the trouble.

With nogil (also) in the feature-reading loops (e.g. around OGR_L_GetNextFeature)

************ with use_arrow=True *******************
len(df)=15, with sql, use_arrow=True took 2.453807899990352
len(df)=15, with where, use_arrow=True took 2.420489699987229
read, lens=[8, 2, 2, 3] sequential, sql, use_arrow=True, took=2.5079255000164267
read, lens=[8, 2, 2, 3] sequential, where, use_arrow=True, 2.5423938000167254
read, lens=[8, 2, 2, 3], parallel, sql, use_arrow=True, took=1.4490034999907948
read, lens=[8, 2, 2, 3], parallel, where, use_arrow=True, took=1.2845507999882102
len(df)=3289194, no filter, use_arrow=True took 7.136394500004826
************ with use_arrow=False *******************
len(df)=15, with sql, use_arrow=False took 5.860275599989109
len(df)=15, with where, use_arrow=False took 7.902896300016437
read, lens=[8, 2, 2, 3] sequential, sql, use_arrow=False, took=6.309563500020886
read, lens=[8, 2, 2, 3] sequential, where, use_arrow=False, 6.09413590002805
read, lens=[8, 2, 2, 3], parallel, sql, use_arrow=False, took=2.110850199998822
read, lens=[8, 2, 2, 3], parallel, where, use_arrow=False, took=3.5822784999909345
len(df)=3289194, no filter, use_arrow=False took 50.74965380001231

Without nogil in the feature-reading loops

************ with use_arrow=True *******************
len(df)=15, with sql, use_arrow=True took 2.415697899996303
len(df)=15, with where, use_arrow=True took 2.8347185000020545
read, lens=[8, 2, 2, 3] sequential, sql, use_arrow=True, took=2.520717900013551
read, lens=[8, 2, 2, 3] sequential, where, use_arrow=True, 2.9033705000183545
read, lens=[8, 2, 2, 3], parallel, sql, use_arrow=True, took=1.8013735999993514
read, lens=[8, 2, 2, 3], parallel, where, use_arrow=True, took=1.212515799998073
len(df)=3289194, no filter, use_arrow=True took 7.0017002999957185
************ with use_arrow=False *******************
len(df)=15, with sql, use_arrow=False took 5.824918400001479
len(df)=15, with where, use_arrow=False took 7.959993800002849
read, lens=[8, 2, 2, 3] sequential, sql, use_arrow=False, took=6.387824200006435
read, lens=[8, 2, 2, 3] sequential, where, use_arrow=False, 6.192911899997853
read, lens=[8, 2, 2, 3], parallel, sql, use_arrow=False, took=2.3500404999940656
read, lens=[8, 2, 2, 3], parallel, where, use_arrow=False, took=3.5810892000154126
len(df)=3289194, no filter, use_arrow=False took 51.31016650001402

Script used:

"""Performance testing for multithreaded reading."""

import time
from concurrent import futures
from time import perf_counter

import pyogrio

if __name__ == "__main__":
    path = "C:/temp/lds-nz-building-outlines/nz-building-outlines.gpkg"
    # path = r"C:\Temp\prc2023\prc2023.gpkg"
    nb_workers = 4

    layer = pyogrio.list_layers(path)[0][0]
    info = pyogrio.read_info(path, layer=layer, force_feature_count=True)
    featurecount = info["features"]

    # Unused alternative filter, kept for reference; overwritten below.
    # where = "{filter} ST_NPOINTS(st_buffer(geom, 10)) > 2000"

    # Set filter so each chunk in the parallel read returns some features, to avoid a
    # bug in gdal with arrow where empty results are a lot slower.
    where = "{filter} ST_NPOINTS(geom) > 500"
    sql_template = f"""
        SELECT * from "{layer}"
            WHERE {where}
        """

    for use_arrow in [True, False]:
        print(f"************ with {use_arrow=} *******************")

        start = perf_counter()
        df = pyogrio.read_dataframe(
            path, sql=sql_template.format(filter=""), use_arrow=use_arrow
        )
        print(f"{len(df)=}, with sql, {use_arrow=} took {perf_counter() - start}")

        start = perf_counter()
        df = pyogrio.read_dataframe(
            path, where=where.format(filter=""), use_arrow=use_arrow
        )
        print(f"{len(df)=}, with where, {use_arrow=} took {perf_counter() - start}")

        time.sleep(5)  # Give the CPUs a break

        # Test multithreading performance
        # -------------------------------
        # Divide featurecount in nb_workers ranges
        ranges = []
        for i in range(nb_workers):
            start_fc = int(i * featurecount / nb_workers)
            end_fc = int((i + 1) * featurecount / nb_workers)
            ranges.append(f"fid >= {start_fc} AND fid < {end_fc} AND ")

        # Test reading sequentially with sql
        start = perf_counter()
        lens = []
        for filter in ranges:
            sql = sql_template.format(filter=filter)
            df = pyogrio.read_dataframe(path, sql=sql, use_arrow=use_arrow)
            lens.append(len(df))
        took = perf_counter() - start
        print(f"read, {lens=} sequential, sql, {use_arrow=}, {took=}")

        # Test reading sequentially with where
        start = perf_counter()
        lens = []
        for filter in ranges:
            df = pyogrio.read_dataframe(
                path, where=where.format(filter=filter), use_arrow=use_arrow
            )
            lens.append(len(df))
        print(
            f"read, {lens=} sequential, where, {use_arrow=}, {perf_counter() - start}"
        )

        time.sleep(5)  # Give the CPUs a break

        # Test reading in parallel with sql
        start = perf_counter()
        with futures.ThreadPoolExecutor(max_workers=nb_workers) as executor:
            results = list(
                executor.map(
                    lambda f, ua: pyogrio.read_dataframe(
                        path, sql=sql_template.format(filter=f), use_arrow=ua
                    ),
                    ranges,
                    [use_arrow] * nb_workers,
                )
            )
            lens = [len(df) for df in results]

        took = perf_counter() - start
        print(f"read, {lens=}, parallel, sql, {use_arrow=}, {took=}")

        # Test reading in parallel with where
        start = perf_counter()
        with futures.ThreadPoolExecutor(max_workers=nb_workers) as executor:
            # print(f"where: {where.format(filter=ranges[0])}")
            results = list(
                executor.map(
                    lambda f, ua: pyogrio.read_dataframe(
                        path, where=where.format(filter=f), use_arrow=ua
                    ),
                    ranges,
                    [use_arrow] * nb_workers,
                )
            )
            lens = [len(df) for df in results]

        took = perf_counter() - start
        print(f"read, {lens=}, parallel, where, {use_arrow=}, {took=}")

        # Finally, read without any filter
        # --------------------------------
        start = perf_counter()
        df = pyogrio.read_dataframe(path, use_arrow=use_arrow)
        print(f"{len(df)=}, no filter, {use_arrow=} took {perf_counter() - start}")

Comment on lines +1168 to +1169
with nogil:
ogr_feature = OGR_L_GetNextFeature(ogr_layer)
Similar comment here, although in this case we can't really move it up (and a few similar cases below as well)

@theroggy theroggy modified the milestones: 0.12.0, 0.12.1 Nov 21, 2025