Problem following basic usage of jupyter mlflow and prefect tutorial #67

Open
Pfriasf opened this issue Aug 23, 2021 · 20 comments

@Pfriasf

Pfriasf commented Aug 23, 2021

Good afternoon,

In the prefect configuration step, I get the following error:

---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/tmp/ipykernel_164/2133135275.py in <module>
     42 flow_run_id = prefect_client.create_flow_run(flow_id=training_flow_id, run_name=f"run {prefect_project_name}")
     43 
---> 44 create_prefect_flow()

/tmp/ipykernel_164/2133135275.py in create_prefect_flow()
     30 storage = S3(s3_bucket)
     31 
---> 32 session_token = get_prefect_token()
     33 prefect_client = Client(api_server=prefect_url, api_token=session_token)
     34 schedule = IntervalSchedule(interval=timedelta(minutes=2))

/tmp/ipykernel_164/2133135275.py in get_prefect_token()
     14 r = requests.get(auth_url)
     15 jsn = r.json()
---> 16 action_url = jsn["methods"]["password"]["config"]["action"]
     17 data = {"identifier": username, "password": password}
     18 headers = {"Accept": "application/json", "Content-Type": "application/json"}

KeyError: 'methods'

The response does not contain the key "methods".

Response example

{
    "id": "bad96217-aac0-4456-8ae7-54467b4c3813",
    "type": "api",
    "expires_at": "2021-08-23T14:00:44.432803488Z",
    "issued_at": "2021-08-23T13:50:44.432803488Z",
    "request_url": "http://mlops.mydomain.com/self-service/login/api",
    "ui": {
        "action": "https://mlops.mydomain.com/.ory/kratos/public/self-service/login?flow=bad96217-aac0-4456-8ae4-54467b4c323e2",
        "method": "POST",
        "nodes": [
            {
                "type": "input",
                "group": "default",
                "attributes": {
                    "name": "csrf_token",
                    "type": "hidden",
                    "value": "",
                    "required": true,
                    "disabled": false
                },
                "messages": null,
                "meta": {}
            },
            {
                "type": "input",
                "group": "password",
                "attributes": {
                    "name": "password_identifier",
                    "type": "text",
                    "value": "",
                    "required": true,
                    "disabled": false
                },
                "messages": null,
                "meta": {
                    "label": {
                        "id": 1070004,
                        "text": "ID",
                        "type": "info"
                    }
                }
            },
            {
                "type": "input",
                "group": "password",
                "attributes": {
                    "name": "password",
                    "type": "password",
                    "required": true,
                    "disabled": false
                },
                "messages": null,
                "meta": {
                    "label": {
                        "id": 1070001,
                        "text": "Password",
                        "type": "info"
                    }
                }
            },
            {
                "type": "input",
                "group": "password",
                "attributes": {
                    "name": "method",
                    "type": "submit",
                    "value": "password",
                    "disabled": false
                },
                "messages": null,
                "meta": {
                    "label": {
                        "id": 1010001,
                        "text": "Sign in",
                        "type": "info",
                        "context": {}
                    }
                }
            }
        ]
    },
    "created_at": "2021-08-23T13:50:44.433949Z",
    "updated_at": "2021-08-23T13:50:44.433949Z",
    "forced": false
}
@bernardolk
Contributor

bernardolk commented Aug 23, 2021

Thanks for pointing this out. We recently upgraded our Ory Kratos module, and the newer version returns the information we need under a different key. We will need to update the tutorial notebook code to reflect that.
You can use action_url = jsn["ui"]["action"]
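
A minimal sketch of that lookup, written to tolerate both response shapes. The helper name `login_action_url` is hypothetical, and the older key path (`methods`, `password`, `config`, `action`) is an assumption based on the notebook line in the traceback above, not on the Kratos documentation.

```python
def login_action_url(flow: dict) -> str:
    """Return the URL that credentials should be POSTed to for a Kratos login flow."""
    # Newer Ory Kratos responses (as in the example above) expose it under "ui".
    ui = flow.get("ui")
    if isinstance(ui, dict) and "action" in ui:
        return ui["action"]
    # Assumed older layout, based on the key path the tutorial notebook used.
    try:
        return flow["methods"]["password"]["config"]["action"]
    except (KeyError, TypeError):
        raise KeyError("no login action URL found in the flow response") from None


# Illustrative payloads only; the URLs are placeholders.
new_style = {"ui": {"action": "https://mlops.mydomain.com/.ory/kratos/public/self-service/login?flow=abc", "method": "POST"}}
old_style = {"methods": {"password": {"config": {"action": "https://example.invalid/login?flow=abc"}}}}

print(login_action_url(new_style))
print(login_action_url(old_style))
```

Checking both layouts keeps the notebook usable on either side of the Kratos upgrade, at the cost of carrying a fallback that may never be hit.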

@jonpoveda

jonpoveda commented Aug 23, 2021

Good afternoon @bernardolk, I'm also following this tutorial, and applying the change you mentioned leads me to another error that I can't figure out how to solve. I've tried multiple things, such as passing flow as a param, creating cookies, forcing a refresh, and passing the value that forcing a refresh returns, with no success. Below is the request result after getting the correct action_url:

{
   "id":"0fcc6629-2668-4af2-8824-c30396d7b173",
   "type":"api",
   "expires_at":"2021-08-23T16:00:06.279179Z",
   "issued_at":"2021-08-23T15:50:06.279179Z",
   "request_url":"http://mlops.mydomain.com/self-service/login/api",
   "ui":{
      "action":"https://mlops.mydomain.com/.ory/kratos/public/self-service/login?flow=0fcc6629-2668-4af2-8824-c30396d7b173",
      "method":"POST",
      "nodes":[
         {
            "type":"input",
            "group":"default",
            "attributes":{
               "name":"csrf_token",
               "type":"hidden",
               "value":"",
               "required":true,
               "disabled":false
            },
            "messages":"None",
            "meta":{
               
            }
         },
         {
            "type":"input",
            "group":"password",
            "attributes":{
               "name":"password_identifier",
               "type":"text",
               "value":"",
               "required":true,
               "disabled":false
            },
            "messages":"None",
            "meta":{
               "label":{
                  "id":1070004,
                  "text":"ID",
                  "type":"info"
               }
            }
         },
         {
            "type":"input",
            "group":"password",
            "attributes":{
               "name":"password",
               "type":"password",
               "required":true,
               "disabled":false
            },
            "messages":"None",
            "meta":{
               "label":{
                  "id":1070001,
                  "text":"Password",
                  "type":"info"
               }
            }
         },
         {
            "type":"input",
            "group":"password",
            "attributes":{
               "name":"method",
               "type":"submit",
               "value":"password",
               "disabled":false
            },
            "messages":"None",
            "meta":{
               "label":{
                  "id":1010001,
                  "text":"Sign in",
                  "type":"info",
                  "context":{
                     
                  }
               }
            }
         }
      ],
      "messages":[
         {
            "id":4010002,
            "text":"Could not find a strategy to log you in with. Did you fill out the form correctly?",
            "type":"error"
         }
      ]
   },
   "created_at":"2021-08-23T15:50:06.281029Z",
   "updated_at":"2021-08-23T15:50:06.281029Z",
   "forced":false
}

@bernardolk
Contributor

bernardolk commented Aug 23, 2021

I am sorry to hear you wasted so much time @jonpoveda! I know what your issue is. Ory also changed the way you need to send the credentials in their flow.
You need to update this line: data = {"identifier": <username>, "password": <pwd>} to data = {"password_identifier": <username>, "password": <pwd>, "method": "password"}
I am pretty sure those are the only changes regarding this issue. But if anything, let me know.
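
For reference, a small sketch of the updated payload plus a check for the kind of rejection shown in the response above. Both helper names are hypothetical; the `ui.messages` layout is taken from the JSON posted earlier in this thread.

```python
def build_login_payload(username: str, password: str) -> dict:
    """Credentials body in the shape described above for the newer Kratos flow."""
    return {
        "password_identifier": username,
        "password": password,
        "method": "password",
    }


def flow_error_messages(response_json: dict) -> list:
    """Collect error texts found under ui.messages when a login is rejected."""
    messages = (response_json.get("ui") or {}).get("messages") or []
    return [m.get("text", "") for m in messages if m.get("type") == "error"]


# Trimmed copy of the rejected response posted above.
rejected = {
    "ui": {
        "messages": [
            {
                "id": 4010002,
                "text": "Could not find a strategy to log you in with. Did you fill out the form correctly?",
                "type": "error",
            }
        ]
    }
}

print(build_login_payload("user@example.com", "secret"))
print(flow_error_messages(rejected))
```

Printing the collected messages right after the POST makes a rejected login visible immediately, instead of surfacing later as a missing key.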

@Pfriasf
Author

Pfriasf commented Aug 24, 2021

Do you know if the Prefect API for creating a project has also changed? It appears to expect a tenant ID:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
/opt/conda/lib/python3.9/site-packages/prefect/client/client.py in _send_request(self, session, method, url, params, headers)
    374         try:
--> 375             response.raise_for_status()
    376         except requests.HTTPError as exc:

/opt/conda/lib/python3.9/site-packages/requests/models.py in raise_for_status(self)
    952         if http_error_msg:
--> 953             raise HTTPError(http_error_msg, response=self)
    954 

HTTPError: 400 Client Error: Bad Request for url: https://prefect.mlops.mydomain.com/graphql

The above exception was the direct cause of the following exception:

ClientError                               Traceback (most recent call last)
/tmp/ipykernel_166/3997178121.py in <module>
     18     flow_run_id = prefect_client.create_flow_run(flow_id=training_flow_id, run_name=f"run {prefect_project_name}")
     19 
---> 20 create_prefect_flow()

/tmp/ipykernel_166/3997178121.py in create_prefect_flow()
     14         data = fetch_data()
     15         train_model(data=data, mlflow_experiment_id=5, alpha=0.3, l1_ratio=0.3)
---> 16     prefect_client.create_project(project_name=prefect_project_name)
     17     training_flow_id = prefect_client.register(flow, project_name=prefect_project_name)
     18     flow_run_id = prefect_client.create_flow_run(flow_id=training_flow_id, run_name=f"run {prefect_project_name}")

/opt/conda/lib/python3.9/site-packages/prefect/client/client.py in create_project(self, project_name, project_description)
    969 
    970         try:
--> 971             res = self.graphql(
    972                 project_mutation,
    973                 variables=dict(

/opt/conda/lib/python3.9/site-packages/prefect/client/client.py in graphql(self, query, raise_on_error, headers, variables, token, retry_on_api_error)
    296             - ClientError if there are errors raised by the GraphQL mutation
    297         """
--> 298         result = self.post(
    299             path="",
    300             server=self.api_server,

/opt/conda/lib/python3.9/site-packages/prefect/client/client.py in post(self, path, server, headers, params, token, retry_on_api_error)
    211             - dict: Dictionary representation of the request made
    212         """
--> 213         response = self._request(
    214             method="POST",
    215             path=path,

/opt/conda/lib/python3.9/site-packages/prefect/client/client.py in _request(self, method, path, params, server, headers, token, retry_on_api_error)
    457         )
    458         session.mount("https://", requests.adapters.HTTPAdapter(max_retries=retries))
--> 459         response = self._send_request(
    460             session=session, method=method, url=url, params=params, headers=headers
    461         )

/opt/conda/lib/python3.9/site-packages/prefect/client/client.py in _send_request(self, session, method, url, params, headers)
    386                         "mutation but the response could not be parsed for more details"
    387                     )
--> 388                 raise ClientError(f"{exc}\n{graphql_msg}") from exc
    389 
    390             # Server-side and non-graphql errors will be raised without modification

ClientError: 400 Client Error: Bad Request for url: https://prefect.mlops.mydomain.com/graphql

The following error messages were provided by the GraphQL server:

    INTERNAL_SERVER_ERROR: Variable "$input" got invalid value null at
        "input.tenant_id"; Expected non-nullable type UUID! not to be null.

The GraphQL query was:

    mutation($input: create_project_input!) {
            create_project(input: $input) {
                id
        }
    }

The passed variables were:

    {"input": {"name": "wine-quality-project-test", "description": null, "tenant_id": null}}

We solved this by updating Prefect in the notebook from '0.14.12' to '0.15.4', which accepts a tenant ID. That lets me move forward, but then we get another error, this time about the versions:

Failed to load and execute Flow's environment: StorageError("An error occurred while unpickling the flow:\n  TypeError('code() takes at most 15 arguments (16 given)')\nThis may be due to one of the following version mismatches between the flow build and execution environments:\n  - prefect: (flow built with '0.15.4', currently running with '0.14.6')\n  - python: (flow built with '3.9.5', currently running with '3.7.9')")

@bernardolk
Contributor

bernardolk commented Aug 24, 2021

The second error I have not seen before. I would recommend that you stick with the previous version since in Open MLOps module definitions we are using an image with Prefect in that version. So if you wish to upgrade the version in the notebook you would also need to upgrade the image so the pods in your cluster run a matching version.
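
To see at a glance which components differ, the mismatch lines can be pulled out of the error text. This is only a sketch: the function name is hypothetical and the regular expression assumes the message keeps the wording shown in the StorageError above.

```python
import re

# Matches fragments such as:
#   prefect: (flow built with '0.15.4', currently running with '0.14.6')
_MISMATCH = re.compile(
    r"(\w+): \(flow built with '([^']+)', currently running with '([^']+)'\)"
)


def parse_version_mismatches(message: str) -> dict:
    """Map each component named in the error text to its (built, running) versions."""
    return {name: (built, running) for name, built, running in _MISMATCH.findall(message)}


error_text = (
    "This may be due to one of the following version mismatches between the flow build "
    "and execution environments:\n"
    "  - prefect: (flow built with '0.15.4', currently running with '0.14.6')\n"
    "  - python: (flow built with '3.9.5', currently running with '3.7.9')"
)

for name, (built, running) in parse_version_mismatches(error_text).items():
    print(f"{name}: flow built with {built}, agent running {running}")
```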

Then, you will need to fix the first error:

Bug: the prefect-server-agent pod is in “CrashLoopBackOff” with the error “Your Prefect Server instance has no tenants. Create a tenant with prefect server create-tenant.”
This will raise a ClientError if we try to deploy a flow to Prefect:

          The following error messages were provided by the GraphQL server:
          
            INTERNAL_SERVER_ERROR: Variable "$input" got invalid value null at
                 "input.tenant_id"; Expected non-nullable type UUID! not to be null.

Solution: This means that the apollo pod didn’t create a tenant. To create one:

  1. Port-forward the apollo service: kubectl port-forward svc/prefect-server-apollo 4200
  2. Run: prefect backend server && prefect server create-tenant --name default --slug default
  3. Restart the agent pod (you can scale it down and up to restart it)

@Pfriasf
Author

Pfriasf commented Aug 25, 2021

I think a tenant already exists: when I query GraphQL, I can see its name, slug, and id.

{
  tenant {
    slug
    id
    name
  }
}

output

{
  "data": {
    "tenant": [
      {
        "slug": "default",
        "id": "84f30d5f-7b41-36f9-a121-123456781b0928",
        "name": "default"
      }
    ]
  }
}
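
The same check can be sketched from Python. The only client call assumed here is `graphql(query)`, which appears in the Prefect traceback earlier in this thread; `list_tenants` and `FakeClient` are hypothetical names, and the fake stands in for a real `prefect.Client` so the snippet runs without a server.

```python
# Same query as above, sent through the client instead of the GraphQL UI.
TENANT_QUERY = "{ tenant { slug id name } }"


def list_tenants(client) -> list:
    """Return the tenant records visible to `client` (empty list if there are none)."""
    result = client.graphql(TENANT_QUERY)
    return list(result["data"]["tenant"])


class FakeClient:
    """Stand-in used only to show the expected response shape without a server."""

    def graphql(self, query):
        return {"data": {"tenant": [{"slug": "default", "id": "<tenant-uuid>", "name": "default"}]}}


tenants = list_tenants(FakeClient())
print([t["slug"] for t in tenants] or "no tenants found: create one first")
```

With a real client, an empty list would point to the missing-tenant case, while a populated list suggests the problem lies elsewhere (for example, in how the client is authenticated).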

However, I followed your instructions.
I ran the port-forward command:

kubectl -n prefect port-forward svc/prefect-server-apollo 4200

output

Forwarding from 127.0.0.1:4200 -> 4200
Forwarding from [::1]:4200 -> 4200

This is where I get stuck, maybe due to lack of knowledge.

Where should I execute this command?

prefect backend server && prefect server create-tenant --name default --slug default

I tried in the same terminal, but of course I can't because it's busy with the port-forward.

I also tried:

kubectl exec -n prefect svc/prefect-server-apollo -- prefect backend server && prefect server create-tenant --name default --slug default

@bernardolk
Contributor

bernardolk commented Aug 25, 2021

@Pfriasf you should open a new terminal tab after port-forwarding, but make sure you have prefect installed locally. What is happening is that you are forwarding the port from the cluster to your local machine, then having prefect installed (via pip install) in your local machine, you can access the prefect server in your cluster from it.
So, do what you did and then in a new terminal tab, after pip installing prefect, execute the commands. See if that works, please.
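
Before running the prefect commands in the second terminal, it may help to confirm that the forwarded port is reachable. A minimal sketch using only the standard library; the helper name is hypothetical, and port 4200 is the one from the port-forward command above.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if port_is_open("127.0.0.1", 4200):
    print("port-forward is up; run the prefect commands in this terminal")
else:
    print("nothing is listening on 127.0.0.1:4200; is kubectl port-forward still running?")
```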

@Pfriasf
Author

Pfriasf commented Aug 25, 2021

After running pip install prefect, I get the following error:

prefect: command not found

I'm using Ubuntu.

@bernardolk
Contributor

There seems to be a problem with your pip installation. Do you have python installed?
This is the package, for reference:
https://pypi.org/project/prefect/
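
One possible cause, offered as a guess rather than a diagnosis: pip installed the package, but the directory holding the `prefect` script is not on PATH (user installs on Linux typically place scripts under `~/.local/bin`). The hypothetical helper below looks in those places.

```python
import os
import shutil
import site
import sysconfig


def locate_cli(name: str):
    """Return the path of `name` if found on PATH or in pip's usual script directories."""
    found = shutil.which(name)
    if found:
        return found
    # Directories where pip commonly places console scripts; they may be missing from PATH.
    candidates = [sysconfig.get_path("scripts"), os.path.join(site.getuserbase(), "bin")]
    for directory in candidates:
        found = shutil.which(name, path=directory)
        if found:
            return found
    return None


path = locate_cli("prefect")
print(path or "prefect CLI not found; try `python3 -m pip install prefect` and check PATH")
```

If a path is printed but the shell still reports "command not found", adding that directory to PATH should make the command available.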

@NhatAnh

NhatAnh commented Sep 6, 2021

Hi,
I got the same error:

ClientError: 400 Client Error: Bad Request for url: https://prefect.mlops.pixtavietnam.com/graphql

The following error messages were provided by the GraphQL server:

    INTERNAL_SERVER_ERROR: Variable "$input" got invalid value null at
        "input.tenant_id"; Expected non-nullable type UUID! not to be null.

The GraphQL query was:

    mutation($input: create_project_input!) {
            create_project(input: $input) {
                id
        }
    }

The passed variables were:

    {"input": {"name": "wine-quality-project", "description": null, "tenant_id": null}}

I followed the fix and it seems to work: the prefect-server-agent pod is running, and the Prefect dashboard shows 1 agent running. But when I run the notebook, I still get the same error. How do I debug this? Thanks

@pedrocwb
Contributor

pedrocwb commented Sep 6, 2021

@NhatAnh Can you share the request you made? Please make sure that the Python and Prefect versions running in your notebook instance are the same as those running in your Prefect agent.
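
A quick way to collect those versions, sketched with the standard library only (the function name is hypothetical). Running the same snippet in the notebook and in the agent's container should give two reports that can be compared line by line.

```python
import platform
from importlib import metadata


def environment_versions(packages=("prefect",)) -> dict:
    """Report the interpreter version plus each package's installed version (None if absent)."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


print(environment_versions())
```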

@NhatAnh

NhatAnh commented Sep 6, 2021

@pedrocwb I'm just doing the tutorial https://github.com/datarevenue-berlin/OpenMLOps/blob/master/tutorials/basic-usage-of-jupyter-mlflow-and-prefect.md

I use OpenMLOps-AWS terraform scripts to set it up, so I think the versions should match.
I can give you the login to jupyter notebook in my setup, if that helps.

How do I check whether a tenant has been created?

@NhatAnh

NhatAnh commented Sep 7, 2021

I tried creating the Prefect project in the Prefect dashboard and then, in the Jupyter notebook, only registering the flow to that existing project. Now I get this error:

ImportError: cannot import name 'get_boto_client' from 'prefect.utilities.aws' (/opt/conda/lib/python3.9/site-packages/prefect/utilities/aws.py)

@NhatAnh

NhatAnh commented Sep 7, 2021

How do I change the Python version of the Jupyter notebook? It seems Prefect does not work well with Python 3.9.

@bernardolk
Contributor

bernardolk commented Sep 7, 2021

You would need to change the notebook image, which is specified in the singleuser option of the jupyterhub module. If you want to test with a different python version, though, you can try creating a virtual env in your machine, locally, with a different python version and install prefect client there. Should be faster. But I am not sure that's the issue, do you have any references to where they say Prefect doesn't play nice with Python 3.9?

@NhatAnh

NhatAnh commented Sep 8, 2021

@omerfsen-gsnd

omerfsen-gsnd commented Sep 29, 2021

It seems the tenant is created, but Apollo can't connect to the GraphQL server.

When the Prefect Helm chart is installed, a job runs to create a tenant ID, as its log shows:

kubectl logs -n prefect prefect-server-create-tenant-job-xxxx (change pod name)
Tenant created with ID: 496e7a7e-7fad-44ca-9cb1xxxxxxxxxx

Also, the default URL for /graphql/ seems to be the internet-facing one, so requests go through Ory Kratos; because they are not authenticated, they get a 401 error (auth failed). So we log on to prefect.domain/graphql/ using Kratos, but the UI itself is not authenticated.

@vkocaman

It seems the tenant is created, but Apollo can't connect to the GraphQL server.

When the Prefect Helm chart is installed, a job runs to create a tenant ID, as its log shows:

kubectl logs -n prefect prefect-server-create-tenant-job-xxxx (change pod name)
Tenant created with ID: 496e7a7e-7fad-44ca-9cb1xxxxxxxxxx

Also, the default URL for /graphql/ seems to be the internet-facing one, so requests go through Ory Kratos; because they are not authenticated, they get a 401 error (auth failed). So we log on to prefect.domain/graphql/ using Kratos, but the UI itself is not authenticated.

@bernardolk can you shed some light on this issue ?

@omerfsen-gsnd

Also, one thing I found is that the module always pulls the latest version:

https://github.com/datarevenue-berlin/OpenMLOps/blob/master/modules/prefect-server/variables.tf#L19-L22

Although this setting is not for the prefect/server Docker image, I saw that the image had an issue, so I manually switched the tenant-creation image (prefectVersionTag) to version 0.15.4, as defined here:

https://github.com/PrefectHQ/server/blob/master/helm/prefect-server/values.yaml#L13-L17

Also, the server tags are different (https://github.com/PrefectHQ/server/tags) and are used with the Helm chart version:

https://github.com/PrefectHQ/server/blob/master/helm/prefect-server/values.yaml#L3-L11

I also updated JupyterHub with a new Docker image that installs version 0.15.4 of the Prefect Python package by updating

https://github.com/datarevenue-berlin/OpenMLOps/blob/master/docker/openmlops-notebook/Dockerfile#L4

but I am still getting the same error. Any update is greatly appreciated!

@bernardolk
Contributor

Alright, so @vkocaman, you are getting a 401 status code when Prefect tries to access the /graphql URL?
You should make sure that you are doing the auth steps correctly: see if the get_prefect_token function is returning a token for you and that you are using it to instantiate the Prefect client. Can you check and get back to me?
I did not understand what you mean by "but UI itself is not authenticated...." Which UI are you talking about?
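
One way to run that check is a small guard around the login response. This is a sketch under assumptions: the helper name is hypothetical, and the `session_token` key is what the tutorial's get_prefect_token is assumed to read from the Kratos response.

```python
def require_session_token(login_response: dict) -> str:
    """Return the session token from a login response, or raise with a readable reason."""
    token = login_response.get("session_token")
    if isinstance(token, str) and token:
        return token
    # Assumed shape: rejected logins carry their messages under ui.messages.
    messages = (login_response.get("ui") or {}).get("messages") or []
    reasons = "; ".join(m.get("text", "") for m in messages) or "no session_token in response"
    raise ValueError(f"login did not return a token: {reasons}")


# Placeholder value; a real token would come from the Kratos login POST.
session_token = require_session_token({"session_token": "example-token"})
print(session_token)
# The token would then be handed to the client as in the notebook:
#   Client(api_server=prefect_url, api_token=session_token)
```

Failing loudly here separates an authentication problem from a tenant or version problem before any GraphQL call is made.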

@omerfsen-gsnd I will make sure to fix the Prefect version, thanks.
