Description
_add_provider_example_dags_to_bundle in airflow-core/src/airflow/dag_processing/bundles/manager.py currently derives the Python package path of an installed provider from its distribution name with a string heuristic:
if package_name.startswith("apache-airflow-providers-"):
suffix = package_name[len("apache-airflow-providers-") :]
module_name = "airflow.providers." + suffix.replace("-", ".")
else:
module_name = package_name.replace("-", "_")
This works for canonical Apache providers because they follow a fixed layout convention, but it has known weak points:
- third-party providers whose distribution name does not start with
apache-airflow-providers- fall back to a naive replace("-", "_") which is often wrong;
- the dash-to-dot reverse is lossy:
foo-bar and foo_bar both come back as foo.bar.
Why entry_point.module does not directly solve this
Provider entry points are declared as e.g. airflow.providers.standard.get_provider_info:get_provider_info. entry_point.module resolves to the get_provider_info module file, not the package, so importlib.import_module(entry_point.module) returns a module without __path__. Walking __path__ to find example_dags/ would iterate zero times.
Proposed approach
Extend ProviderInfo in shared/providers_discovery/src/airflow_shared/providers_discovery/providers_discovery.py to record the parent package path of each discovered provider:
- derive it from
entry_point.module by stripping the trailing .get_provider_info segment, or
- surface it from the dist metadata (
dist.files, top_level.txt, etc.).
Mirror the field on airflow.providers_manager.ProviderInfo so consumers in airflow-core can read it.
Acceptance criteria
ProviderInfo exposes a stable attribute (e.g. module_path or package_module) populated for every discovered provider.
- The heuristic block in
_add_provider_example_dags_to_bundle is replaced with a direct importlib.import_module(info.module_path) lookup.
- The tracking comment
# tracked at <issue-url> in manager.py is removed.
- Tests cover both canonical Apache providers (
apache-airflow-providers-standard, apache-airflow-providers-common-sql) and a synthetic third-party provider with a non-canonical distribution name.
Context
Surfaced during review of #66161 (#66161) by @potiuk and @jscheffl. Deferred to keep that PR focused on the example-DAG-bundle migration; this issue captures the proper fix.
Description
_add_provider_example_dags_to_bundleinairflow-core/src/airflow/dag_processing/bundles/manager.pycurrently derives the Python package path of an installed provider from its distribution name with a string heuristic:This works for canonical Apache providers because they follow a fixed layout convention, but it has known weak points:
apache-airflow-providers-fall back to a naivereplace("-", "_")which is often wrong;foo-barandfoo_barboth come back asfoo.bar.Why
entry_point.moduledoes not directly solve thisProvider entry points are declared as e.g.
airflow.providers.standard.get_provider_info:get_provider_info.entry_point.moduleresolves to theget_provider_infomodule file, not the package, soimportlib.import_module(entry_point.module)returns a module without__path__. Walking__path__to findexample_dags/would iterate zero times.Proposed approach
Extend
ProviderInfoinshared/providers_discovery/src/airflow_shared/providers_discovery/providers_discovery.pyto record the parent package path of each discovered provider:entry_point.moduleby stripping the trailing.get_provider_infosegment, ordist.files,top_level.txt, etc.).Mirror the field on
airflow.providers_manager.ProviderInfoso consumers inairflow-corecan read it.Acceptance criteria
ProviderInfoexposes a stable attribute (e.g.module_pathorpackage_module) populated for every discovered provider._add_provider_example_dags_to_bundleis replaced with a directimportlib.import_module(info.module_path)lookup.# tracked at <issue-url>inmanager.pyis removed.apache-airflow-providers-standard,apache-airflow-providers-common-sql) and a synthetic third-party provider with a non-canonical distribution name.Context
Surfaced during review of #66161 (#66161) by @potiuk and @jscheffl. Deferred to keep that PR focused on the example-DAG-bundle migration; this issue captures the proper fix.