
mltable / Azureml Dataprep seems to be still dependent on pkg_resources and thus is broken with python 3.12 #38915

Open
lucasfijen opened this issue Dec 17, 2024 · 4 comments
Assignees
Labels
Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. dataprep Issues subcategorized for ML dataprep library Machine Learning needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Attention Workflow: This issue is responsible by Azure service team.

Comments

@lucasfijen
Contributor

lucasfijen commented Dec 17, 2024

  • Package Name: mltable / azureml dataprep
  • Package Version: 1.6.1
  • Operating System: Linux
  • Python Version: 3.12.8

Describe the bug

Hi,
I was glad to see that azureml should now support 3.12, but it seems that not all of my azureml dependencies truly support 3.12 yet.

It looks like azureml dataprep still depends on pkg_resources, and azureml dataprep is a dependency of mltable:

ImportError while importing test module '/home/vsts/work/1/s/tests/ourtest.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/importlib/__init__.py:90: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ourcodepathhere.py:4: in <module>
    import mltable
/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/mltable/__init__.py:11: in <module>
    from .mltable import MLTable, load, from_delimited_files, from_parquet_files, from_json_lines_files, from_paths, \
/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/mltable/mltable.py:19: in <module>
    from azureml.dataprep.api._loggerfactory import track, _LoggerFactory, trace
/opt/hostedtoolcache/Python/3.12.8/x64/lib/python3.12/site-packages/azureml/dataprep/api/_loggerfactory.py:8: in <module>
    import pkg_resources
E   ModuleNotFoundError: No module named 'pkg_resources'

pip freeze slimmed down to relevant packages:

azure-ai-ml==1.23.0
azure-common==1.1.28
azure-core==1.32.0
azure-cosmos==4.9.0
azure-identity==1.19.0
azure-keyvault-secrets==4.9.0
azure-mgmt-core==1.5.0
azure-storage-blob==12.19.0
azure-storage-file-datalake==12.14.0
azure-storage-file-share==12.20.0
azureml-dataprep==5.1.6
azureml-dataprep-native==41.0.0
azureml-dataprep-rslex==2.22.5
azureml-fsspec==1.3.1
azureml-mlflow==1.58.0
mltable==1.6.1
opencensus==0.11.4
opencensus-context==0.1.3
opencensus-ext-azure==1.1.13
opencensus-ext-logging==0.1.1

Hope we can sort this out!

To Reproduce
As above: importing mltable fails because its azureml dataprep dependency still relies on pkg_resources, which is deprecated in Python 3.12.
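
A quick way to confirm the precondition before importing mltable is to check whether pkg_resources can be found in the active environment. This is only a diagnostic sketch using the standard library; the helper name is_importable is hypothetical and not part of any Azure package.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if the top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    # pkg_resources ships with setuptools, so both are reported together.
    for name in ("pkg_resources", "setuptools"):
        status = "found" if is_importable(name) else "MISSING"
        print(f"{name}: {status}")
```

If pkg_resources is reported as missing, the import of mltable shown in the traceback above would be expected to fail in the same way.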

Expected behavior
I expected the packages to be compatible with Python 3.12, as stated in the Azure ML package documentation.


@github-actions github-actions bot added Client This issue points to a problem in the data-plane of the library. customer-reported Issues that are reported by GitHub users external to the Azure organization. Machine Learning needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team question The issue doesn't require a change to the product in order to be resolved. Most issues start as that Service Attention Workflow: This issue is responsible by Azure service team. labels Dec 17, 2024

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Azure/azure-ml-sdk @azureml-github.

@lucasfijen lucasfijen changed the title mltable / Azureml Dataprep seems to be still dependend on pkg_resources and thus is broken with python 3.12 mltable / Azureml Dataprep seems to be still dependent on pkg_resources and thus is broken with python 3.12 Dec 17, 2024
@canwaf

canwaf commented Feb 19, 2025

I was about to file a similar bug report, which I'll just include below.

tl;dr: pip install setuptools will solve this one for you, @lucasfijen, if it is similar to mine.

Potential new issue with similar problems.

  • Package Name: azureml-core
  • Package Version: azureml-core==1.59.0.post1
  • Operating System: ubuntu 20.04
  • Python Version: Python 3.12.6

Describe the bug
When generating an output download script from an Azure PromptFlow run, the following Python cell is created:

import csv
import json
import logging
import requests
from pathlib import Path
from azureml.core import Workspace
from azureml.core.authentication import InteractiveLoginAuthentication

Without setuptools, from azureml.core import Workspace will fail because the module pkg_resources is not found (it is supplied by setuptools).

To Reproduce
Steps to reproduce the behavior:

  1. Fresh python environment
  2. pip install azureml-core
  3. from azureml.core import Workspace will fail.

Expected behavior
Package dependencies correctly configured in azureml-core.

@lucasfijen
Contributor Author

Hi @canwaf ,

Nice addition to this ticket. I think it indeed comes from the same source. Although installing setuptools might temporarily fix this issue, it is not a desired solution, as it would just postpone the problem (as was done in Python 3.11). pkg_resources is deprecated as of Python 3.12 for a reason, and I think Microsoft needs to update their packages for this.

Also, as per the setuptools documentation: https://setuptools.pypa.io/en/latest/pkg_resources.html

Use of pkg_resources is deprecated in favor of importlib.resources, importlib.metadata and their backports (importlib_resources, importlib_metadata). Some useful APIs are also provided by packaging (e.g. requirements and version parsing). Users should refrain from new usage of pkg_resources and should work to port to importlib-based solutions.

(see also a nice writeup here: mu-editor/mu#2485).
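
As an illustration of what such a port can look like, here is a minimal sketch that replaces a pkg_resources version lookup with importlib.metadata. It is an assumption that version lookup is the relevant use case; the thread does not show what azureml dataprep actually uses pkg_resources for, and the helper name installed_version is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Roughly the stdlib counterpart of
    pkg_resources.get_distribution(dist_name).version, without needing
    setuptools at runtime.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution name taken from the pip freeze output earlier in this issue.
    print(installed_version("azureml-dataprep"))
```

Returning None for a missing distribution is a design choice of this sketch; code that previously relied on pkg_resources.DistributionNotFound being raised would need to handle that case explicitly.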

Hope to hear something from Microsoft on this issue, @vivram

@kingernupur kingernupur added the dataprep Issues subcategorized for ML dataprep library label Feb 27, 2025
@jenshnielsen
Contributor

I think there is a bit of confusion in the above, so I just want to clarify.
Specifically, two mostly independent things happened.

  • pkg_resources is a part of setuptools and always has been. This means it is not deprecated as of any version of Python; it is deprecated by setuptools. Specifically, this happened officially in setuptools 67.5.0.
  • virtualenv/venv used to implicitly install setuptools into any newly created environment, since it was such a common dependency for building packages. With the PEP specifying build requirements, and with packages being built in isolated environments, setuptools is no longer installed automatically, and a package that depends on setuptools must declare it as a dependency. Depending on setuptools includes importing either setuptools or pkg_resources. For venv this change happened in Python 3.12.

There are therefore two separate bugs in the Azure packages.

  • A missing dependency on setuptools. This bug has probably existed since the creation of the Azure Python SDK but has only recently become visible.
  • Use of pkg_resources, which is deprecated and should be migrated to the recommended modern solution.
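
The first of these bugs can be checked locally by reading a distribution's declared requirements from its installed metadata. The following is a rough sketch using only the standard library; the helper names requirement_name and declares_dependency are hypothetical, and the name parsing is deliberately simplified rather than a full requirement parser.

```python
import re
from importlib import metadata
from typing import Optional


def requirement_name(requirement: str) -> str:
    """Extract the normalized project name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    # Normalize runs of '-', '_' and '.' to a single '-' and lowercase.
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def declares_dependency(dist_name: str, dependency: str) -> Optional[bool]:
    """Report whether dist_name lists dependency in its metadata.

    Returns None when dist_name is not installed. Requirements that only
    apply to an extra or an environment marker are counted as declared.
    """
    try:
        requirements = metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return None
    wanted = requirement_name(dependency)
    return any(requirement_name(req) == wanted for req in requirements)


if __name__ == "__main__":
    # Distribution names taken from the reports earlier in this issue.
    for dist in ("azureml-dataprep", "azureml-core"):
        print(dist, "declares setuptools:", declares_dependency(dist, "setuptools"))
```

A result of False for a package that imports pkg_resources at runtime would correspond to the missing-dependency bug described in the first bullet.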
