diff --git a/01_querying_rest_api/README.md b/01_querying_rest_api/README.md new file mode 100644 index 0000000..719093c --- /dev/null +++ b/01_querying_rest_api/README.md @@ -0,0 +1,61 @@ +# Querying the Cortex Search REST API + +## Prerequisites + +Before you can run the script, ensure you have the following prerequisites installed: + +- Python 3.x +- pip (Python package installer) + Additionally, you must have access to a Snowflake account and the required permissions to query the Cortex Search Service at the specified database and schema. + +## Installation +First, clone this repository to your local machine using git: + +``` +git clone https://github.com/snowflake-labs/cortex-search.git +cd cortex-search/01_querying_rest_api +``` + +Install the necessary Python dependencies by running: + +``` +pip install -r requirements.txt +``` + +## Key-pair auth configuration + +Additionally, you must generate a private key for JWT auth with Snowflake as described in [this document](https://docs.snowflake.com/user-guide/key-pair-auth#configuring-key-pair-authentication). + +**Note**: record the path to your generated RSA private key, e.g., `/path/to/my/rsa_key.p8`. You will need to supply it as the `--private-key-path` parameter when querying the service from the command line, or reference the file path from within a notebook. + +## Command line usage + +The `simple_query.py` example script can be executed from the command line. For instance: + +``` +python3 examples/simple_query.py -u https://my_org-my_account.us-west-2.aws.snowflakecomputing.com -s DB.SCHEMA.SERVICE_NAME -q "the sky is blue" -c "description,text" -l 10 -a my_account -k /path/to/my/rsa_key.p8 -n my_name +``` + +**Arguments:** + +- `-u`, `--url`: URL of the Snowflake instance. 
See [this guide](https://docs.snowflake.com/en/user-guide/admin-account-identifier#finding-the-organization-and-account-name-for-an-account) for finding your Account URL +- `-s`, `--qualified-service-name`: The fully-qualified Cortex Search Service name, in the format DATABASE.SCHEMA.SERVICE +- `-q`, `--query`: The search query string +- `-c`, `--columns`: Comma-separated list of columns to return in the results +- `-l`, `--limit`: The max number of results to return +- `-a`, `--account`: Snowflake account name. See [this guide](https://docs.snowflake.com/en/user-guide/admin-account-identifier#finding-the-organization-and-account-name-for-an-account) for finding your Account name +- `-k`, `--private-key-path`: Path to the RSA private key file for authentication. +- `-n`, `--user-name`: Username for the Snowflake account +- `-r`, `--role`: Role to use for the query. If provided, a session token scoped to this role will be created and used for authentication to the API. + +The `interactive_query.py` example provides an interactive CLI that demonstrates caching the JWT used for authentication between requests for better performance and implements retries when the JWT has expired. You can run it like the following: + +``` +python3 examples/interactive_query.py -u https://my_org-my_account.us-west-2.aws.snowflakecomputing.com -s DB.SCHEMA.SERVICE_NAME -c "description,text" -a my_account -k /path/to/my/rsa_key.p8 -n my_name +``` + +This will launch an interactive session, where you will be prompted repeatedly for search queries to your Cortex Search Service. 
+ +## License + +Apache Version 2.0 diff --git a/examples/interactive_query.py b/01_querying_rest_api/examples/interactive_query.py similarity index 100% rename from examples/interactive_query.py rename to 01_querying_rest_api/examples/interactive_query.py diff --git a/examples/simple_query.py b/01_querying_rest_api/examples/simple_query.py similarity index 100% rename from examples/simple_query.py rename to 01_querying_rest_api/examples/simple_query.py diff --git a/requirements.txt b/01_querying_rest_api/requirements.txt similarity index 100% rename from requirements.txt rename to 01_querying_rest_api/requirements.txt diff --git a/src/__init__.py b/01_querying_rest_api/src/__init__.py similarity index 100% rename from src/__init__.py rename to 01_querying_rest_api/src/__init__.py diff --git a/src/utils/jwtutils.py b/01_querying_rest_api/src/utils/jwtutils.py similarity index 100% rename from src/utils/jwtutils.py rename to 01_querying_rest_api/src/utils/jwtutils.py diff --git a/src/utils/queryutils.py b/01_querying_rest_api/src/utils/queryutils.py similarity index 100% rename from src/utils/queryutils.py rename to 01_querying_rest_api/src/utils/queryutils.py diff --git a/02_querying_python_api/notebook_query.ipynb b/02_querying_python_api/notebook_query.ipynb new file mode 100644 index 0000000..b9c451e --- /dev/null +++ b/02_querying_python_api/notebook_query.ipynb @@ -0,0 +1,141 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "15a04599", + "metadata": {}, + "source": [ + "# Querying a Cortex Search Service with the Python API\n", + "This notebook shows a simple demo of querying a Cortex Search Service with the Python API. 
\n", + "The documentation for this query pattern can be [accessed here](https://docs.snowflake.com/LIMITEDACCESS/cortex-search/query-cortex-search-service).\n", + "\n", + "\n", + "## Prerequisites\n", + "To install the required packages in your python environment, run: \n", + " ``pip install snowflake snowflake-snowpark-python``\n", + "\n", + "\n", + "Note: Querying Cortex Search requires version 0.8.0 or later of the `snowflake` package." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8549cdb7-3b82-416b-b8d0-900732227212", + "metadata": {}, + "outputs": [], + "source": [ + "from snowflake.core import Root # snowflake >= 0.8.0\n", + "from snowflake.snowpark.session import Session\n", + "\n", + "# Set connection parameters\n", + "SNOWFLAKE_ACCOUNT = \"\"\n", + "SNOWFLAKE_USER = \"\"\n", + "SNOWFLAKE_PASSWORD = \"\"\n", + "SNOWFLAKE_WAREHOUSE = \"\"\n", + "\n", + "CORTEX_SEARCH_DATABASE = \"\"\n", + "CORTEX_SEARCH_SCHEMA = \"\"\n", + "CORTEX_SEARCH_SERVICE = \"\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1ae0e607-a9aa-400e-8bb7-a9b6945cccaf", + "metadata": {}, + "outputs": [], + "source": [ + "def make_session():\n", + " \"\"\"\n", + " Create Snowpark Session from connection parameters\n", + " \"\"\"\n", + " connection_parameters = {\n", + " \"user\": SNOWFLAKE_USER,\n", + " \"password\": SNOWFLAKE_PASSWORD,\n", + " # \"authenticator\": \"externalbrowser\",\n", + " \"account\": SNOWFLAKE_ACCOUNT,\n", + " \"warehouse\": SNOWFLAKE_WAREHOUSE,\n", + " \"database\": CORTEX_SEARCH_DATABASE,\n", + " \"schema\": CORTEX_SEARCH_SCHEMA\n", + " }\n", + "\n", + " return Session.builder.configs(connection_parameters).create()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "00202c33-97e0-4860-b34a-fe599d3948b2", + "metadata": {}, + "outputs": [], + "source": [ + "# make session and root objects\n", + "session = make_session()\n", + "root = Root(session)" + ] + }, + { + "cell_type": "code", + 
"execution_count": null, + "id": "65286440", + "metadata": {}, + "outputs": [], + "source": [ + "def query_cortex_search_service(svc, query, columns=[], _filter = {}, limit=5):\n", + " \"\"\"\n", + " Query the specified Cortex Search service with the specified query string and search parameters.\n", + " \"\"\"\n", + " db, schema = session.get_current_database(), session.get_current_schema()\n", + " \n", + " cortex_search_service = (\n", + " root\n", + " .databases[db]\n", + " .schemas[schema]\n", + " .cortex_search_services[svc]\n", + " )\n", + " \n", + " response = cortex_search_service.search(\n", + " query,\n", + " columns=columns,\n", + " filter=_filter,\n", + " limit=limit\n", + " )\n", + "\n", + " return response.results" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c2bb1d8d", + "metadata": {}, + "outputs": [], + "source": [ + "# query the service\n", + "print(query_cortex_search_service(svc=CORTEX_SEARCH_SERVICE, query=\"foo bar\"))" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "cs-quickstart", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.4" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/README.md b/README.md index 531660a..4438efb 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,12 @@ # Cortex Search -This repository contains example usage, including authentication, for the Cortex Search REST API, currently in Private Preview. The official preview documentation can be [found here](https://docs.snowflake.com/LIMITEDACCESS/cortex-search/cortex-search-overview). +This repository contains example usage, including authentication, for Cortex Search, currently in Private Preview. 
The official preview documentation can be [found here](https://docs.snowflake.com/LIMITEDACCESS/cortex-search/cortex-search-overview). + +This repository contains the following examples: +- [01. Querying the REST API](/01_querying_rest_api/README.md) +- [02. Querying the Python API](/02_querying_python_api/README.md) +- [03. Building an AI search app in Streamlit](/03_streamlit_ai_search/README.md) + ## Prerequisites diff --git a/examples/notebook_query.ipynb b/examples/notebook_query.ipynb deleted file mode 100644 index b020571..0000000 --- a/examples/notebook_query.ipynb +++ /dev/null @@ -1,274 +0,0 @@ -{ - "cells": [ - { - "cell_type": "code", - "execution_count": 30, - "id": "8549cdb7-3b82-416b-b8d0-900732227212", - "metadata": {}, - "outputs": [], - "source": [ - "from datetime import timedelta, timezone, datetime\n", - "import jwt\n", - "from cryptography.hazmat.primitives.serialization import load_pem_private_key\n", - "from cryptography.hazmat.primitives.serialization import Encoding\n", - "from cryptography.hazmat.primitives.serialization import PublicFormat\n", - "from cryptography.hazmat.backends import default_backend\n", - "import base64\n", - "from getpass import getpass\n", - "import hashlib\n", - "import requests\n", - "import json\n", - "import os\n", - "import pandas as pd\n", - "pd.options.display.max_colwidth = 1000\n", - "\n", - "# account parameters\n", - "SNOWFLAKE_ACCOUNT = \"\" # must be capitalized\n", - "SNOWFLAKE_USER = \"\" # must be capitalized\n", - "SNOWFLAKE_URL = \"https://org-acc.snowflakecomputing.com\"\n", - "PRIVATE_KEY_PATH = \"/path/to/your/rsa_key.p8\"\n", - "\n", - "# service parameters\n", - "CORTEX_SEARCH_DATABASE = \"\"\n", - "CORTEX_SEARCH_SCHEMA = \"\"\n", - "CORTEX_SEARCH_SERVICE = \"\"\n", - "\n", - "# columns to query in the service\n", - "COLUMNS = [\n", - " \"COL1\",\n", - " \"COL2\",\n", - "]" - ] - }, - { - "cell_type": "code", - "execution_count": 31, - "id": "4ab6c048-b34c-47f9-a032-aa20fadb91d6", - 
"metadata": {}, - "outputs": [], - "source": [ - "def generate_JWT_token():\n", - " \"\"\"\n", - " https://docs.snowflake.com/en/developer-guide/sql-api/authenticating#generating-a-jwt-in-python\n", - " Generate a valid JWT token from snowflake account name, user name, private key and private key passphrase.\n", - " \"\"\"\n", - " # Prompt for private key passphrase\n", - " def get_private_key_passphrase():\n", - " return getpass('Private Key Passphrase: ')\n", - "\n", - " # Generate encoded public key\n", - " with open(PRIVATE_KEY_PATH, 'rb') as pem_in:\n", - " pemlines = pem_in.read()\n", - " try:\n", - " private_key = load_pem_private_key(pemlines, None, default_backend())\n", - " except TypeError:\n", - " private_key = load_pem_private_key(pemlines, get_private_key_passphrase().encode(), default_backend())\n", - " public_key_raw = private_key.public_key().public_bytes(Encoding.DER, PublicFormat.SubjectPublicKeyInfo)\n", - " sha256hash = hashlib.sha256()\n", - " sha256hash.update(public_key_raw)\n", - " public_key_fp = 'SHA256:' + base64.b64encode(sha256hash.digest()).decode('utf-8')\n", - "\n", - " # Generate JWT payload\n", - " qualified_username = SNOWFLAKE_ACCOUNT + \".\" + SNOWFLAKE_USER\n", - " now = datetime.now(timezone.utc)\n", - " lifetime = timedelta(minutes=60)\n", - " payload = {\n", - " \"iss\": qualified_username + '.' 
+ public_key_fp,\n", - " \"sub\": qualified_username,\n", - " \"iat\": now,\n", - " \"exp\": now + lifetime\n", - " }\n", - " return jwt.encode(payload, key=private_key, algorithm=\"RS256\")\n", - " \n", - "jwt_token = generate_JWT_token()\n", - "\n", - "headers = {\n", - " 'X-Snowflake-Authorization-Token-Type': 'KEYPAIR_JWT',\n", - " 'Content-Type': 'application/json',\n", - " 'Accept': 'application/json',\n", - " 'Authorization': f'Bearer {jwt_token}',\n", - "}" - ] - }, - { - "cell_type": "code", - "execution_count": 34, - "id": "1ae0e607-a9aa-400e-8bb7-a9b6945cccaf", - "metadata": {}, - "outputs": [], - "source": [ - "def query_service(query):\n", - " \"\"\"\n", - " Query the specified Snowflake service with the given query string.\n", - " \"\"\"\n", - " url = f\"{SNOWFLAKE_URL}/api/v2/cortex/search/databases/{CORTEX_SEARCH_DATABASE}/schemas/{CORTEX_SEARCH_SCHEMA}/services/{CORTEX_SEARCH_SERVICE}\"\n", - " data = {\n", - " \"query\": query,\n", - " \"columns\": COLUMNS,\n", - " \"filter\": \"\",\n", - " \"limit\": 10\n", - " }\n", - " \n", - " jwt_token = generate_JWT_token()\n", - " headers = {\n", - " 'X-Snowflake-Authorization-Token-Type': 'KEYPAIR_JWT',\n", - " 'Content-Type': 'application/json',\n", - " 'Accept': 'application/json',\n", - " 'Authorization': f'Bearer {jwt_token}',\n", - " }\n", - " \n", - " try:\n", - " response = requests.post(url, headers=headers, json=data)\n", - " response.raise_for_status()\n", - " except requests.exceptions.HTTPError as http_err:\n", - " print(f\"HTTP error occurred: {http_err} - Status code: {response.status_code}\")\n", - " except Exception as err:\n", - " print(f\"An error occurred: {err}\")\n", - " else:\n", - " return response.json()" - ] - }, - { - "cell_type": "code", - "execution_count": 37, - "id": "00202c33-97e0-4860-b34a-fe599d3948b2", - "metadata": {}, - "outputs": [ - { - "data": { - "text/html": [ - "
" - ], - "text/plain": [ - " TALK_ID SPEAKER_1 \\\n", - "0 220 Joseph Lekuton \n", - "1 2221 Boniface Mwangi \n", - "2 523 Erik Hersman \n", - "3 2653 Charity Wayua \n", - "4 21033 Mary Maker \n", - "\n", - " TITLE \\\n", - "0 A parable for Kenya \n", - "1 The day I stood up alone \n", - "2 Reporting crisis via texting \n", - "3 A few ways to fix a government \n", - "4 Why I fight for the education of refugee girls (like me) \n", - "\n", - " DESCRIPTION \\\n", - "0 Joseph Lekuton, a member of parliament in Kenya, starts with the story of his remarkable education, then offers a parable of how Africa can grow. His message of hope has never been more relevant. \n", - "1 Photographer Boniface Mwangi wanted to protest against corruption in his home country of Kenya. So he made a plan: He and some friends would stand up and heckle during a public mass meeting. But when the moment came ... he stood alone. What happened next, he says, showed him who he truly was. As he says, \"There are two most powerful days in your life. The day you are born, and the day you discover why.\" Graphic images. \n", - "2 At TEDU 2009, Erik Hersman presents the remarkable story of Ushahidi, a GoogleMap mashup that allowed Kenyans to report and track violence via cell phone texts following the 2008 elections, and has evolved to continue saving lives in other countries. \n", - "3 Charity Wayua put her skills as a cancer researcher to use on an unlikely patient: the government of her native Kenya. She shares how she helped her government drastically improve its process for opening up new businesses, a crucial part of economic health and growth, leading to new investments and a World Bank recognition as a top reformer. \n", - "4 After fleeing war-torn South Sudan as a child, Mary Maker found security and hope in the school at Kenya's Kakuma Refugee Camp. 
Now a teacher of young refugees herself, she sees education as an essential tool for rebuilding lives -- and empowering a generation of girls who are too often denied entrance into the classroom. \"For the child of war, an education can turn their tears of loss into a passion for peace,\" Maker says. \n", - "\n", - " URL \n", - "0 https://www.ted.com/talks/joseph_lekuton_a_parable_for_kenya/ \n", - "1 https://www.ted.com/talks/boniface_mwangi_the_day_i_stood_up_alone/ \n", - "2 https://www.ted.com/talks/erik_hersman_reporting_crisis_via_texting/ \n", - "3 https://www.ted.com/talks/charity_wayua_a_few_ways_to_fix_a_government/ \n", - "4 https://www.ted.com/talks/mary_maker_why_i_fight_for_the_education_of_refugee_girls_like_me/ " - ] - }, - "execution_count": 37, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "querystr = \"\"\n", - "\n", - "df = pd.DataFrame(query_service(querystr)[\"results\"])\n", - "df[COLUMNS].head()" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python3 (cs-quickstart)", - "language": "python", - "name": "cs-quickstart" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.12.2" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -}