All notable changes to this project will be documented in this file. If you make a notable change to the project, please add a line describing the change to the "unreleased" section. The maintainers will make an effort to keep the GitHub Releases page up to date with this changelog. The format is based on Keep a Changelog.
- Features and updates:
  - `WakeLosses` updates
    - Option added to the `WakeLosses` analysis method to correct for freestream wind speed heterogeneity across a wind plant when estimating internal wake losses. The method relies on a user-provided freestream wind speedup CSV file. The wake loss example notebook has been updated to illustrate how to use this option.
    - `WakeLosses` analysis method updated to flag and exclude unrealistic turbine wind speed measurements.
  - `MonteCarloAEP` updates
    - Add an `n_jobs` input to the Monte Carlo AEP method to allow the underlying models to be parallelized during each iteration for faster ML model computation.
    - Add an `apply_iav` input to the Monte Carlo AEP analysis method to toggle the addition of the IAV factor at the end of the analysis.
    - Add a `progress_bar` flag to `MonteCarloAEP.run()` to allow for turning the simulation's default progress bar on or off.
  - Option added to the IEC power curve model in the `openoa/utils/power_curve/functions` module to linearly interpolate power between wind speed bin centers.
  - Implement missing `compute_wind_speed` in `openoa/utils/met_data_processing.py` and apply it to the `PlantData` reanalysis validation steps in place of the manual calculation.
  - Functions for downloading hourly ERA5 and MERRA-2 reanalysis data added to the `openoa/utils/downloader` module.
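As a rough illustration of the bin-center interpolation option for the IEC power curve model noted above, the sketch below linearly interpolates power between wind speed bin centers in plain Python. The function name `interpolate_binned_power`, the bin centers, and the power values are hypothetical and are not taken from OpenOA's `power_curve` module. Holding the first and last bin values constant outside the binned range is also an assumption made only for this example.

```python
from bisect import bisect_right


def interpolate_binned_power(wind_speed, bin_centers, bin_power):
    """Linearly interpolate power between wind speed bin centers.

    Hypothetical helper for illustration; not OpenOA's implementation.
    ``bin_centers`` must be sorted in ascending order.
    """
    if len(bin_centers) != len(bin_power) or not bin_centers:
        raise ValueError("bin_centers and bin_power must be non-empty and equal length")
    # Assumption: hold the edge values constant outside the binned range.
    if wind_speed <= bin_centers[0]:
        return bin_power[0]
    if wind_speed >= bin_centers[-1]:
        return bin_power[-1]
    # Index of the first bin center strictly greater than wind_speed.
    hi = bisect_right(bin_centers, wind_speed)
    lo = hi - 1
    frac = (wind_speed - bin_centers[lo]) / (bin_centers[hi] - bin_centers[lo])
    return bin_power[lo] + frac * (bin_power[hi] - bin_power[lo])


# Made-up bin centers (m/s) and mean bin power (kW)
centers = [3.0, 4.0, 5.0, 6.0]
power = [0.0, 100.0, 250.0, 450.0]
halfway = interpolate_binned_power(4.5, centers, power)
```

Without interpolation, a wind speed of 4.5 m/s would take the value of whichever bin it falls in; with it, the estimate sits halfway between the 4 m/s and 5 m/s bin values.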
- Fixes:
  - Add a default value for `PlantData`'s `asset_distance_matrix` and `asset_direction_matrix` to ensure projects not utilizing location data are compatible.
  - Fix miscellaneous pandas warnings.
  - Rerun pre-commit and update code styling for adherence to 3.10+ standards.
  - Replace mutable default arguments with None and handle Nones internally.
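For readers unfamiliar with why mutable default arguments are replaced with `None`, here is a minimal, generic Python sketch of the pitfall and the usual fix. The function names `append_bad` and `append_good` are hypothetical and do not correspond to any OpenOA function.

```python
def append_bad(value, items=[]):
    # The default list is created once, at function definition time,
    # so every call that omits `items` mutates the same shared object.
    items.append(value)
    return items


def append_good(value, items=None):
    # Use None as a sentinel and build a fresh list on each call.
    if items is None:
        items = []
    items.append(value)
    return items


bad_first = list(append_bad(1))  # copy taken before the second call
bad_second = append_bad(2)       # still carries the 1 from the first call
good_first = append_good(1)
good_second = append_good(2)     # independent of the first call
```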
- During the custom test collection, convert the `Path` objects to `str` to avoid issues with the type enforcement of `list[str]` for `args` in Pytest v9.
- Update PyGAM minimum version for its latest update that includes Python 3.10-3.13 support.
- Remove maximum version pins for scipy and statsmodels with the support of the latest Python versions.
- Adds a maximum version for scikit-learn for a change in their `__sklearn_tags__` support.
- Deprecate support for Python 3.8 and 3.9, and add support for Python 3.12 and 3.13.
- Update the min and max versions to test in the testing CI workflow.
- Utilizes pytest xfail and subtests to manage intermittent and finicky test failures until a long-term solution can be implemented.
- Pin Pandas maximum version to the 2.x release cycle.
- Pin SciPy to >= 1.7 and <1.14 to avoid an incompatibility error with PyGAM.
- Updates the Anaconda recommendation to alert users to the fact that Anaconda is no longer technically free, and so commercial users should consider Miniforge or Miniconda installations.
- Updates the GitHub Actions workflows to use the latest versions of their workflow dependencies.
- Changes a reference from the old `MonteCarloAEP._hours_in_res` to the new `MonteCarloAEP.resample_hours`.
- Adds a check for missing data before filling in gaps in a time series, which helps avoid Pandas warnings.
- Patches `pyproject.toml`'s package data specification to include openoa in the valid packages to install.
- Updated compatibility with Pandas datetime offsets. All uppercase offset strings representing
  one hour or less have been replaced with the lowercase version. This stems from an update in the
  Pandas frequency API that breaks in 2.2.0. See the below changes to update frequency settings. The
  soon-to-be-deprecated style from Pandas will continue to be supported in OpenOA, but will display
  a `DeprecationWarning` with support extending until OpenOA v4.
  - M -> ME (MS still allowed)
  - H -> h
  - T -> min
  - S -> s
  - L -> ms
  - U -> us
  - N -> ns
- Replaced the "ME" default time basis with "MS" to maintain consistency with the examples.
- Fixes a bug in the frequency validation where OpenOA attempted to convert a monthly frequency offset into seconds. Prior to Pandas 2.0 this was supported, but "M" would return 1 minute, so OpenOA will no longer attempt to convert "ME" or "MS", which are unsupported or incorrect, respectively.
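The offset alias changes listed above can be sketched as a small lookup that rewrites an old-style frequency string and emits a `DeprecationWarning`. This is only an illustration under stated assumptions: `DEPRECATED_ALIASES` and `modernize_offset` are hypothetical names, the helper handles only a simple "optional integer plus alias" string, and it is not how OpenOA itself performs the conversion.

```python
import re
import warnings

# Mapping taken from the list above; "MS" has no entry because it is unchanged.
DEPRECATED_ALIASES = {
    "M": "ME",
    "H": "h",
    "T": "min",
    "S": "s",
    "L": "ms",
    "U": "us",
    "N": "ns",
}


def modernize_offset(freq):
    """Return ``freq`` with a deprecated alias swapped for its replacement.

    Hypothetical helper, not part of OpenOA: accepts an optional integer
    multiple followed by an alias, e.g. "10T" or "H".
    """
    match = re.fullmatch(r"(\d*)([A-Za-z]+)", freq)
    if match is None:
        raise ValueError(f"Unrecognized offset string: {freq!r}")
    multiple, alias = match.groups()
    if alias in DEPRECATED_ALIASES:
        new_alias = DEPRECATED_ALIASES[alias]
        warnings.warn(
            f"Offset alias {alias!r} is deprecated; use {new_alias!r} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return f"{multiple}{new_alias}"
    # Already a current alias (e.g. "MS", "h", "min"): return unchanged.
    return freq
```

For example, `modernize_offset("10T")` yields `"10min"` and warns, while `modernize_offset("MS")` is returned unchanged without a warning.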
- Python 3.11 is now supported.
- Updates the dependency requirements to minimize the number of required packages, and have a more
  expansive list of modifiers. Users can now use any combination of
  `pip install openoa[examples, develop, docs, nrel-wind, reanalysis]` to ensure the appropriate
  packages are installed for their workflow.
- Adds a `--unit` and `--regression` flag for running pytest that works in addition to `pytest test/unit` or `pytest test/regression`.
- Converts some configuration files into `pyproject.toml` settings to reduce visual clutter at the top-level of the directory.
- Updates chained `.loc` expressions to be a single `.loc` expression in project_ENGIE.py to silence a Pandas deprecation warning about future changes.
- Adds a missing NaN assignment to `project_ENGIE.py:clean_scada`, which causes a slight change in results for the TIE and wake loss regression tests.
- `openoa.utils.timeseries.gap_fill_data_frame()` now returns the original data if there is no data to fill in, avoiding a Pandas `concat` deprecation warning about pending behavioral changes.
- The turbine capacity value used for power curve filtering in `TurbineLongTermGrossEnergy` is changed to the rated power from the asset table instead of the maximum power from SCADA. This makes the power curve filtering more robust to turbine power outliers above rated power.
- Fixed a minor bug in the Cubico example workflow that caused the download of reanalysis data without checking for its existence, unlike what is done with the project data.
- Updates the README file and documentation site homepage to be more user friendly.
- Includes warnings about limitations and lack of validation of static yaw misalignment method.
Please see the updated documentation for a complete overview of the new and improved OpenOA. Much will look familiar, but using the library should now be much more streamlined, and usage should be significantly faster.
- `from openoa import PlantData`
  - The `PlantData` class has been entirely reorganized around attrs dataclasses and direct use of Pandas data frames. For more details on usage, please check the examples page of the documentation or the updated API documentation.
  - `PlantData` now validates user data based on the data schema provided by the user through the `PlantMetaData` object. See the links above for details and usage. This means there is no longer any need to subclass `PlantData` and write a custom `prepare` method. Now users simply define their data schema, and `PlantData` is able to do all of the work and validate the data.
  - IEC 61400-25 tag names are now used throughout the code base for column naming/calling conventions.
  - The package name is changed from `operational_analysis` to `openoa` to be more consistent with how we expect to import OpenOA!
  - Common methods are now readily available through `PlantData`, such as `PlantData.turbine_ids`, `PlantData.tower_ids`, `PlantData.turbine_df("turb_id")`, or `PlantData.tower_df("tower_id")`.
  - Better `__repr__` methods for `PlantData` and `PlantMetaData`.
    - Improved `__repr__` methods that can detect Jupyter Notebooks or terminal usage to print as a string or as Markdown.
    - Printing a `PlantData` object now provides a high level statistical summary of each of the datasets in `PlantData`, alongside other key variables.
    - Printing a `PlantMetaData` object now shows the default or provided column mapping with the associated expected dtypes and units, alongside other key variables.
    - Creating a class will take all of the same parameters, moving all data validation parameters to the front of the arguments for each class, so check your class initializations when changing versions.
    - `AnalysisClass.run()` now takes all of the same arguments as the class initialization, except for those that modify what data will be validated. For example, `MonteCarloAEP` has arguments `reg_temperature` and `reg_wind_direction`, which flag if additional columns should be present in the reanalysis data, therefore modifying the data validation requirements. As such, they cannot be updated in `run()`, and a new analysis class instance will need to be created.
    - `reanalysis_subset` is being replaced with `reanalysis_products` in all cases to use a consistent naming convention across classes.
  - Analysis requirements and minimum schema have been provided in the `openoa/schema` library. To review a dictionary of the minimal data requirements for an analysis, users may view the `ANALYSIS_REQUIREMENTS` found in `openoa/schema/metadata.py`, or import it and view it as a dictionary: `from openoa.schema.metadata import ANALYSIS_REQUIREMENTS`. Alternatively, there is a simple landing page for analysis-specific schema files available in the schema readme.
- `from openoa.analysis import MonteCarloAEP`
  - A static yaw misalignment analysis class `StaticYawMisalignment` has been added to estimate static yaw misalignment as a function of wind speed for individual wind turbines using turbine-level SCADA data.
  - A new `WakeLosses` analysis class has been added to estimate wake losses utilizing either turbine-level or met-tower level wind conditions.
  - Hard-coded reanalysis product abbreviation requirements in the analysis classes have been moved to check that the provided abbreviations match the reanalysis abbreviations used for the `PlantData.reanalysis` dictionary keys.
  - A deep copy of the original `PlantData` object is now stored in the analysis class so that the project data is stable between uses, allowing more flexibility for users running a variety of analyses.
  - Analysis classes are now attached to `PlantData` at the time of import, maintaining the same behavior as a standalone analysis class import. For example, the following two import patterns produce the same results:

    ```python
    from openoa import PlantData
    from openoa.analysis import WakeLosses

    kwargs = dict(
        metadata="path_to_metadata",
        scada="scada data or path to CSV file",
        meter="meter data or path to CSV file",
        tower="tower data or path to CSV file",
        asset="asset data or path to CSV file",
        reanalysis={"product_key": "data or path to CSV file"},
        status="status data or path to file",
    )
    project = PlantData.from_dict(kwargs)

    # Original pattern, which is still in operation
    wake_classic = WakeLosses(project)

    # New, equivalent pattern
    wake_new = project.WakeLosses()
    ```

  - All analysis inputs are able to be provided at the initialization or run level, allowing more flexibility for when analyses are designed and modified. Additionally, the analysis defaults are set at initialization, so settings are only changed between runs if the user specifies a change.
  - The only settings that cannot be modified in an analysis run are those that change the underlying data settings, which will now require a new analysis class instance. See the following example:

    ```python
    from openoa import PlantData

    project = PlantData()  # note: kwargs must actually be provided to create a PlantData object

    # Use and validate the SCADA temperature data
    aep = project.MonteCarloAEP(reg_temperature=True)

    # No longer allowed because this adds a new wind direction data requirement, which may
    # not have been validated
    aep.run(reg_wind_direction=True)

    # New method for running variations on the underlying data, which do not modify the original
    # project data in any way
    aep_temp = project.MonteCarloAEP(reg_temperature=True)
    aep_wd = project.MonteCarloAEP(reg_wind_direction=True)

    # Compare your results
    ...
    ```

  - `TurbineLongTermGrossEnergy.filter_turbine_data` was cleaned up for a minor gain in efficiency and readability.
- `tools` has been renamed to `utils`
- `pandas_plotting` has been renamed to `plot`, and a new, more customizable plotting API has been implemented allowing for publication-quality figures to be generated with ease.
- Added downloader utils module containing functions for downloading generic files from the web, downloading files from Zenodo, and downloading monthly-resolution ERA5 and MERRA2 data.
- Nearly all methods can operate on a Pandas DataFrame with provided column names, or pandas Series for the parameters, and return back the data in the same manner.
- Massive speedups across the board by using the most efficient Pandas and/or NumPy code under the hood to power the same methods with a more polished and robust interface.
- Updated documentation for users and contributors in the Getting Started section.
- New and improved contributing guide.
- All notebooks have been updated to use our new API and demonstrate its usage.
- New notebooks dedicated solely to introducing new concepts.
- Added example notebook "02c_plant_aep_analysis_cubico.ipynb" that demonstrates creating a `PlantData` object and running AEP analysis for two Cubico wind plants (Kelmarsh and Penmanshiel) using open data downloaded from Zenodo.
- The new `06_wake_loss_analysis` example notebook highlights the new `WakeLosses` analysis class using the La Haute Borne data.
- The new `07_static_yaw_misalignment` example notebook demonstrates the application of the yaw misalignment method using the example La Haute Borne data.
- Upgrading past major versions of Scikit-Learn (1.0) and Pandas (2.0), in conjunction with their own dependencies, caused small divergences in the MonteCarloAEP analysis method with Daily GBM, and the Wake Losses analysis method with UQ. The magnitude of the differences is small compared with the magnitude of the output.
- In general, OpenOA is now moving away from pinning the maximum dependency version, and will stick to defining minimum dependencies to ensure modern API usage is supported across the software.
- The following methods have been removed from the plotting library, `utils/plot.py`, given that they have either been replaced with newer methods or have been unused and unmaintained for long enough that their original intent is lost. If you still use any of these, please let us know in the Issues, and we'll be happy to bring it up to date:
  - `plot_array`
  - `subplot_powerRose_array`
  - `powerRose_array`
  - `subplot_c1_c2`
  - `subplot_c1_c2_flagged`
  - `subplot_c1_c2_raw_flagged`
  - `subplt_power_curve`
  - `turbine_polar_line`
  - `turbine_polar_4Dscatter`
  - `turbine_polar_contourf`
  - `turbine_polar_contour`
- Everything from release candidate 1
- IEC 61400-25 tag names are now used throughout the code base for column naming/calling conventions
- Wake Loss Method now released(!) and available via: `from openoa.analysis import WakeLosses`.
- The package name is changed from `operational_analysis` to `openoa` to be more consistent with how we expect to import OpenOA!
- `PlantData` is now fully based on attrs dataclasses and utilizing the pandas `DataFrame` for all internal data structures
  - `PlantData` can now be imported via `from openoa import PlantData`
  - By using attrs users no longer have to subclass `PlantData` and create their own `PlantData.prepare` method.
  - Users can now bring their own column naming, and provide a metadata definition so columns are mapped under the hood through the `PlantMetaData` class (see Intro Example for more information!)
  - `PlantData.scada` (or similar) is now used in place of accessing the SCADA (or similar) dataframe
  - v2 `ReanalysisData` and `AssetData` methods have been absorbed by `PlantData` in favor of a unified data structure and means to operate on data.
  - v2 `TimeSeriesTable` is removed in favor of a pandas-based API and data usage
- openoa has a new import structure
  - `PlantData` is available at the top level: `from openoa import PlantData`
  - toolkits -> utils via `from openoa.utils import xx`
    - pandas_plotting -> plot
    - quality_check_automation -> qa (formerly located in methods)
  - methods -> analysis via `from openoa.analysis import xx`
- Convenience methods such as `PlantData.turbine_ids` or `PlantData.tower_df(tower_id="x")` have been added to address commonly used code patterns
- Analysis methods are now available through `from openoa.analysis import <AnalysisClass>`
- A wake loss analysis class `WakeLosses` has been added to estimate operational wake losses using turbine-level SCADA data
  - The new `06_wake_loss_analysis` example notebook demonstrates how to use the wake loss analysis method
- Renamed `compute_shear_v3` to `compute_shear` and deleted old version of `compute_shear`.
- The `utils` subpackage has been cleaned up to take both pandas `DataFrame` and `Series` objects where appropriate, refactors pandas code to be much cleaner for both performance and readability, has more user-friendly error messages, and has more consistent outputs
- `openoa.utils.imputing.correlation_matrix_by_id_column` has been renamed to `openoa.utils.imputing.asset_correlation_matrix`
- A new 00_x example notebook replaces the 1a/b QA examples to highlight how the `project_ENGIE.py` methods are created. This creates an example for users to work with and significantly more details on how to use the new `PlantData` and `PlantMetaData` methods.
- Documentation reorganization and cleanup
- Replaced hard-coded reanalysis dates in plant analysis with automatic valid date selection and added optional user-defined end date argument. Fixed bug in normalization to 30-day months.
- Toolkit added for downloading reanalysis data using the PlanetOS API
- Added hourly resolution to AEP calculation
- Added wind farm plotting function to pandas_plotting toolkit using the Bokeh library
- Split the QC methods into a more generic `QualityControlDiagnosticSuite` class and WTK-specific subclass: `WindToolKitQualityControlDiagnosticSuite`.
- IAV incorporation in AEP calculation
- Set power to 0 for windspeeds above and below cutoff in IEC power curve function.
- Split unit tests from regression tests and updated CI pipeline to run the full regression tests weekly.
- Flake8 with Black code style implemented with git hook to run on commit
- Updated long-term loss calculations to weight by monthly/daily long-term gross energy
- Added wind turbine asset data to example ENGIE project
- Reduce the amount of time it takes to run regression tests by decreasing the number of Monte Carlo iterations. Reduce tolerance of float comparisons in plant analysis regression test. Linear regression on daily data is removed from test.
- Bugfixes, such as fixing an improper Python version specifier in setup.py and replacing some straggling references to the master branch with main.
- Modify bootstrapping approach for period of record sampling. Data is now sampled with replacement, across 100% of the POR data.
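The bootstrapping change above (sampling with replacement across 100% of the period-of-record data) can be pictured with a short standard-library sketch. The helper name `bootstrap_period_of_record` and the monthly energy values are hypothetical and for illustration only. OpenOA's own implementation may differ in how the data are indexed and drawn.

```python
import random


def bootstrap_period_of_record(records, seed=None):
    """Resample the period of record with replacement.

    Hypothetical helper, not OpenOA's code: draws len(records) samples so the
    resampled set is the same size as (100% of) the original data.
    """
    rng = random.Random(seed)
    # choices() samples with replacement, so entries may repeat or be omitted.
    return rng.choices(records, k=len(records))


# Made-up monthly energy values (MWh) standing in for period-of-record data
monthly_energy = [410.0, 385.5, 402.3, 377.8, 395.1, 420.6]
sample = bootstrap_period_of_record(monthly_energy, seed=0)
```

Because sampling is done with replacement, each Monte Carlo iteration sees a sample of the same length as the original record in which some periods appear more than once and others not at all.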
- Cleaned up dependencies for JOSS review. Adding peer-reviewed JOSS paper.
- Add Binder button to Readme which makes running the example notebooks easier.
- Set maximum Python version to 3.8, due to an install issue for dependency Shapely on Mac with Python 3.9.
- Replaced `GeoPandas` functionality with `pyproj` and `Shapely` for coordinate reference system conversion and distance measurements.
- Moved and renamed tests and updated the documentation accordingly.
- Switch to semantic versioning from this release forward.
- Efficiency improvements in AEP calculation
- Energy Yield Analysis (EYA) added to Operational Assessment (OA) Gap Analysis method
- Uncertainty quantification for electrical losses and longterm turbine gross energy
- Implemented open source Engie example data
- Complete update of example notebooks
- Switch to standard BSD-3 Clause license
- Automated quality control method to assist with data ingestion. Tools in this method include daylight savings time change detection and identification of the diurnal cycle.
- Add electrical losses method
- Method for estimating long-term turbine gross energy (excluding downtime and underperformance losses)
- CI pipeline using GitHub Actions includes regression testing with Pytest, code coverage reporting via CodeCov, packaging and distribution via PyPI, and automatic documentation using ReadTheDocs.
- Python3 Support
- Addition of reanalysis schemas to the Sphinx documentation
- Easy import of EIA data using new module: Metadata_Fetch
- Updated contributing.md document
- Quality checks for reanalysis data
- Improved installation instructions
- Integration tests are now performed in CI
- Performed PEP8 linting
- Refactor many analysis and toolkit modules to make them conform to a standard API (init, prepare, and run method).
- Timeseries Table is now an integrated component, no sparkplug-datastructures dependency
- Plant Level AEP method w/ Monte Carlo
- Turbine / Scada level toolkits: Filtering, Imputing, Met, Pandas Plotting, Timeseries, Unit Conversion
- Most toolkits and all methods are fully documented in Sphinx.
- Two example notebooks: Operational AEP Analysis and Turbine Analysis
- All toolkits except for Pandas Plotting have > 80% test coverage.