Bug: progress=true consumes the iterator #303

Open
@perrygeo

Description

Describe the bug
The progress bar feature (introduced in #300 and released in 0.20) consumes the iterator in order to get the length of the sequence. This leaves zero features to analyze, so every zonal_stats(..., progress=True) call returns [].
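
A minimal illustration of the failure mode, using a stand-in generator rather than the actual rasterstats internals (the `features` function here is hypothetical): counting the items of a generator walks it to the end, so nothing remains for the real work.

```python
def features():
    """Hypothetical stand-in for a stateful feature generator."""
    for i in range(3):
        yield {"id": i}


gen = features()

# Counting the items advances the generator all the way to the end...
total = sum(1 for _ in gen)

# ...so a second pass over the same object yields nothing.
remaining = list(gen)

print(total, remaining)  # prints: 3 []
```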

To Reproduce

Steps to reproduce the behavior:
Install rasterstats >= 0.20 and run

from rasterstats import zonal_stats

v = "tests/data/polygons.shp"
r = "tests/data/slope.tif"

a = zonal_stats(v, r, progress=False)
b = zonal_stats(v, r, progress=True)
assert a == b   # FAILS

Potential Solutions

The design of the io module is very explicitly a stateful generator (hence the gen in gen_zonal_stats) so we can't consume it to determine the length.

  1. Remove the explicit length (con: the progress bar will be useless with a sequence of unknown length)
  2. Iterate twice (con: a performance hit, but at least it would be correct)
  3. Add a length-hinting private function to the io module so that we can determine the length of any vector input efficiently (con: it may not be possible to determine the length efficiently for all inputs)

None of these are good options. If we can't figure out a solution, I'll need to remove the progress bar for correctness' sake.
