Jaysahnan/gro 418 business lookup template #8
base: dev
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -1,2 +1,4 @@ | ||
| **/.DS_Store | ||
| .DS_Store | ||
| .DS_Store | ||
| .env | ||
| cursorrules.md | ||
| .cursorrules | 
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,55 @@ | ||
| # Stagehand + Browserbase: Business Lookup with Agent | ||
| 
     | 
||
| ## AT A GLANCE | ||
| - Goal: Automate business registry searches using an autonomous AI agent with computer-use capabilities. | ||
| - Uses Stagehand Agent in CUA mode to navigate complex UI elements, apply filters, and extract structured business data. | ||
| - Demonstrates extraction with Pydantic schema validation for consistent data retrieval. | ||
| - Docs → https://docs.stagehand.dev/basics/agent | ||
| 
     | 
||
| ## GLOSSARY | ||
| - agent: create an autonomous AI agent that can execute complex multi-step tasks | ||
| Docs → https://docs.stagehand.dev/basics/agent#what-is-agent | ||
| - extract: extract structured data from web pages using natural language instructions | ||
| Docs → https://docs.stagehand.dev/basics/extract | ||
| 
     | 
||
| ## QUICKSTART | ||
| 1) python -m venv venv | ||
| 2) source venv/bin/activate # On Windows: venv\Scripts\activate | ||
| 3) pip install stagehand python-dotenv pydantic | ||
| 4) cp .env.example .env | ||
| 5) Add required API keys/IDs to .env | ||
| 6) python main.py | ||
| 
     | 
||
| ## EXPECTED OUTPUT | ||
| - Initializes Stagehand session with Browserbase | ||
| - Displays live session link for monitoring | ||
| - Navigates to SF Business Registry search page | ||
| - Agent searches for business using DBA Name filter | ||
| - Agent completes search and opens business details | ||
| - Extracts structured business information (DBA Name, Account Number, NAICS Code, etc.) | ||
| - Outputs extracted data as JSON | ||
| - Closes session cleanly | ||
| 
     | 
||
| ## COMMON PITFALLS | ||
| - "ModuleNotFoundError": ensure all dependencies are installed via pip | ||
| - Missing credentials: verify .env contains BROWSERBASE_PROJECT_ID, BROWSERBASE_API_KEY, GOOGLE_API_KEY, and OPENAI_API_KEY (main.py reads OPENAI_API_KEY for the openai/gpt-4.1 model) | ||
| - Google API access: ensure you have access to Google's gemini-2.5-computer-use-preview-10-2025 model | ||
| 
         Review comment: link to their dashboard here.  | 
||
| - Agent failures: check that the business name exists in the registry and that max_steps is sufficient for complex searches | ||
| - Import errors: activate your virtual environment if you created one | ||
| 
     | 
||
| ## USE CASES | ||
| • Business verification: Automate registration status checks, license validation, and compliance verification for multiple businesses. | ||
| • Data enrichment: Collect structured business metadata (NAICS codes, addresses, ownership) for research or CRM updates. | ||
| • Due diligence: Streamline background checks by autonomously searching and extracting business registration details from public registries. | ||
| 
     | 
||
| ## NEXT STEPS | ||
| • Parameterize search: Accept business names as command-line arguments or from a CSV file for batch processing. | ||
| • Expand extraction: Add support for additional fields like tax status, licenses, or historical registration changes. | ||
| • Multi-registry support: Extend agent to search across multiple city or state business registries with routing logic. | ||
| 
     | 
||
| ## HELPFUL RESOURCES | ||
| 📚 Stagehand Docs: https://docs.stagehand.dev/stagehand | ||
| 🎮 Browserbase: https://www.browserbase.com | ||
| 💡 Templates: https://github.com/browserbase/stagehand/tree/main/examples | ||
| 
         Review comment: this links to nothing; check the other templates too.  | 
||
| 📧 Need help? [email protected] | ||
| 
     | 
||
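
The "Parameterize search" item under NEXT STEPS could start from something like the sketch below. It only covers collecting business names from command-line arguments or a CSV file. The helper name `load_business_names`, the `--csv` flag, and the `dba_name` column are illustrative assumptions, not part of this template.

```python
import argparse
import csv
from typing import List, Optional


def load_business_names(argv: Optional[List[str]] = None) -> List[str]:
    """Collect business names from positional args and/or a CSV file.

    Hypothetical helper: the --csv flag and the "dba_name" column name
    are assumptions made for this sketch.
    """
    parser = argparse.ArgumentParser(description="Batch business lookup input")
    parser.add_argument("names", nargs="*", help="Business names to look up")
    parser.add_argument("--csv", dest="csv_path", help="CSV file with a dba_name column")
    args = parser.parse_args(argv)

    names = list(args.names)
    if args.csv_path:
        with open(args.csv_path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                value = (row.get("dba_name") or "").strip()
                if value:
                    names.append(value)

    # Drop duplicates while preserving the order in which names were given.
    return list(dict.fromkeys(names))
```

Each returned name would then replace the hard-coded `business_name` in main.py, for example by looping over the list and running the existing agent flow once per name.
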
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,127 @@ | ||
| # Stagehand + Browserbase: Business Lookup with Agent - See README.md for full documentation | ||
| 
     | 
||
| import os | ||
| import asyncio | ||
| import json | ||
| from dotenv import load_dotenv | ||
| from stagehand import Stagehand, StagehandConfig | ||
| from pydantic import BaseModel, Field | ||
| from typing import Optional | ||
| 
     | 
||
| # Load environment variables | ||
| load_dotenv() | ||
| 
     | 
||
| # Business search variables | ||
| business_name = "Jalebi Street" | ||
| 
     | 
||
| 
         Review comment: extra \n, we should use prettier or something for standardized templates.  | 
||
| 
     | 
||
| async def main(): | ||
| print("Starting business lookup...") | ||
| 
     | 
||
| # Initialize Stagehand with Browserbase for cloud-based browser automation. | ||
| # Note: set verbose=0 to prevent API keys from appearing in logs when handling sensitive data. | ||
| config = StagehandConfig( | ||
| env="BROWSERBASE", | ||
| api_key=os.environ.get("BROWSERBASE_API_KEY"), | ||
| project_id=os.environ.get("BROWSERBASE_PROJECT_ID"), | ||
| model_name="openai/gpt-4.1", | ||
| model_api_key=os.environ.get("OPENAI_API_KEY"), | ||
| browserbase_session_create_params={ | ||
| "project_id": os.environ.get("BROWSERBASE_PROJECT_ID"), | ||
| }, | ||
| verbose=1 # 0 = errors only, 1 = info, 2 = debug | ||
| # (When handling sensitive data like passwords or API keys, set verbose=0 to prevent secrets from appearing in logs.) | ||
| # https://docs.stagehand.dev/configuration/logging | ||
| ) | ||
| 
     | 
||
| try: | ||
| # Use async context manager for automatic resource management | ||
| async with Stagehand(config) as stagehand: | ||
| 
         Review comment: this code is a little weird, maybe just me, but we can have some others check this out. Follow-up comment: nvm, ignore this.  | 
||
| # Initialize browser session to start automation. | ||
| print("Stagehand initialized successfully") | ||
| session_id = None | ||
| if hasattr(stagehand, 'session_id'): | ||
| session_id = stagehand.session_id | ||
| elif hasattr(stagehand, 'browserbase_session_id'): | ||
| session_id = stagehand.browserbase_session_id | ||
| 
     | 
||
| if session_id: | ||
| print(f"Live View Link: https://browserbase.com/sessions/{session_id}") | ||
| 
     | 
||
| page = stagehand.page | ||
| 
     | 
||
| # Navigate to SF Business Registry search page. | ||
| print("Navigating to SF Business Registry...") | ||
| await page.goto( | ||
| "https://data.sfgov.org/stories/s/Registered-Business-Lookup/k6sk-2y6w/", | ||
| wait_until="domcontentloaded", | ||
| timeout=60000 | ||
| ) | ||
| 
     | 
||
| # Create agent with computer use capabilities for autonomous business search. | ||
| # Using CUA mode allows the agent to interact with complex UI elements like filters and tables. | ||
| print("Creating Computer Use Agent...") | ||
| agent = stagehand.agent( | ||
| provider="google", | ||
| model="gemini-2.5-computer-use-preview-10-2025", | ||
| instructions="You are a helpful assistant that can use a web browser to search for business information.", | ||
| options={ | ||
| "api_key": os.getenv("GOOGLE_API_KEY"), | ||
| }, | ||
| ) | ||
| 
     | 
||
| print(f"Searching for business: {business_name}") | ||
| result = await agent.execute( | ||
| instruction=f'Find and look up the business "{business_name}" in the SF Business Registry. Use the DBA Name filter to search for "{business_name}", apply the filter, and click on the business row to view detailed information. Scroll towards the right to see the NAICS code.', | ||
| max_steps=30, | ||
| auto_screenshot=True | ||
| ) | ||
| 
     | 
||
| if not result.success: | ||
| raise Exception("Agent failed to complete the search") | ||
| 
     | 
||
| print("Agent completed search successfully") | ||
| 
     | 
||
| # Extract comprehensive business information after agent completes the search. | ||
| # A structured schema keeps the shape of the extracted data consistent even if the page layout changes. | ||
| print("Extracting business information...") | ||
| 
     | 
||
| # Define schema using Pydantic | ||
| class BusinessInfo(BaseModel): | ||
| dba_name: str = Field(..., description="DBA Name") | ||
| ownership_name: Optional[str] = Field(None, description="Ownership Name") | ||
| business_account_number: str = Field(..., description="Business Account Number") | ||
| location_id: Optional[str] = Field(None, description="Location Id") | ||
| street_address: Optional[str] = Field(None, description="Street Address") | ||
| business_start_date: Optional[str] = Field(None, description="Business Start Date") | ||
| business_end_date: Optional[str] = Field(None, description="Business End Date") | ||
| neighborhood: Optional[str] = Field(None, description="Neighborhood") | ||
| naics_code: str = Field(..., description="NAICS Code") | ||
| naics_code_description: Optional[str] = Field(None, description="NAICS Code Description") | ||
| 
     | 
||
| business_info = await page.extract( | ||
| "Extract all visible business information including DBA Name, Ownership Name, Business Account Number, Location Id, Street Address, Business Start Date, Business End Date, Neighborhood, NAICS Code, and NAICS Code Description", | ||
| schema=BusinessInfo | ||
| ) | ||
| 
     | 
||
| print("Business information extracted:") | ||
| print(json.dumps(business_info.model_dump(), indent=2)) | ||
| 
     | 
||
| print("Session closed successfully") | ||
| 
     | 
||
| except Exception as error: | ||
| print(f"Error during business lookup: {error}") | ||
| raise | ||
| 
     | 
||
| 
     | 
||
| if __name__ == "__main__": | ||
| try: | ||
| asyncio.run(main()) | ||
| except Exception as err: | ||
| print(f"Error in business lookup: {err}") | ||
| print("Common issues:") | ||
| print(" - Check .env file has BROWSERBASE_PROJECT_ID and BROWSERBASE_API_KEY") | ||
| print(" - Verify GOOGLE_API_KEY is set for the agent") | ||
| print("Docs: https://docs.browserbase.com/stagehand") | ||
| raise SystemExit(1)  # exit() comes from the site module and is intended for interactive use | ||
| 
     | 
||
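
The "Common issues" hints printed in the `__main__` block could also be checked before a session is created, so a missing key fails fast with a clear message. A minimal sketch, assuming the variable names this script reads (`BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`); the helper `missing_env_vars` is hypothetical and not part of Stagehand.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Variables that main.py reads via os.environ.get / os.getenv.
REQUIRED_ENV_VARS = (
    "BROWSERBASE_API_KEY",
    "BROWSERBASE_PROJECT_ID",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
)


def missing_env_vars(
    required: Iterable[str] = REQUIRED_ENV_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    # Passing env explicitly makes the check easy to test; the default is os.environ.
    source = os.environ if env is None else env
    return [name for name in required if not (source.get(name) or "").strip()]
```

One way to use it would be to call `missing_env_vars()` right after `load_dotenv()` and raise `SystemExit` with the joined names if the list is not empty.
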
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| # Stagehand + Browserbase: Business Lookup with Agent | ||
| 
     | 
||
| 
         Review comment: same comments as on the Python one; also just realized the Python README links to the TS repo, which is not ideal either.  | 
||
| ## AT A GLANCE | ||
| - Goal: Automate business registry searches using an autonomous AI agent with computer-use capabilities. | ||
| - Uses Stagehand Agent in CUA mode to navigate complex UI elements, apply filters, and extract structured business data. | ||
| - Demonstrates extraction with Zod schema validation for consistent data retrieval. | ||
| - Docs → https://docs.stagehand.dev/basics/agent | ||
| 
     | 
||
| ## GLOSSARY | ||
| - agent: create an autonomous AI agent that can execute complex multi-step tasks | ||
| Docs → https://docs.stagehand.dev/basics/agent#what-is-agent | ||
| - extract: extract structured data from web pages using natural language instructions | ||
| Docs → https://docs.stagehand.dev/basics/extract | ||
| 
     | 
||
| ## QUICKSTART | ||
| 1) npm install | ||
| 2) cp .env.example .env | ||
| 3) Add required API keys/IDs to .env | ||
| 4) npm start | ||
| 
     | 
||
| ## EXPECTED OUTPUT | ||
| - Initializes Stagehand session with Browserbase | ||
| - Displays live session link for monitoring | ||
| - Navigates to SF Business Registry search page | ||
| - Agent searches for business using DBA Name filter | ||
| - Agent completes search and opens business details | ||
| - Extracts structured business information (DBA Name, Account Number, NAICS Code, etc.) | ||
| - Outputs extracted data as JSON | ||
| - Closes session cleanly | ||
| 
     | 
||
| ## COMMON PITFALLS | ||
| - Dependency install errors: ensure npm install completed | ||
| - Missing credentials: verify .env contains BROWSERBASE_PROJECT_ID, BROWSERBASE_API_KEY, and GOOGLE_GENERATIVE_AI_API_KEY (the variable the agent code reads) | ||
| - Google API access: ensure you have access to Google's gemini-2.5-computer-use-preview-10-2025 model | ||
| - Agent failures: check that the business name exists in the registry and that maxSteps is sufficient for complex searches | ||
| 
     | 
||
| ## USE CASES | ||
| • Business verification: Automate registration status checks, license validation, and compliance verification for multiple businesses. | ||
| • Data enrichment: Collect structured business metadata (NAICS codes, addresses, ownership) for research or CRM updates. | ||
| • Due diligence: Streamline background checks by autonomously searching and extracting business registration details from public registries. | ||
| 
     | 
||
| ## NEXT STEPS | ||
| • Parameterize search: Accept business names as command-line arguments or from a CSV file for batch processing. | ||
| • Expand extraction: Add support for additional fields like tax status, licenses, or historical registration changes. | ||
| • Multi-registry support: Extend agent to search across multiple city or state business registries with routing logic. | ||
| 
     | 
||
| ## HELPFUL RESOURCES | ||
| 📚 Stagehand Docs: https://docs.stagehand.dev/stagehand | ||
| 🎮 Browserbase: https://www.browserbase.com | ||
| 💡 Templates: https://github.com/browserbase/stagehand/tree/main/examples | ||
| 📧 Need help? [email protected] | ||
| 
     | 
||
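
The "Multi-registry support" item under NEXT STEPS mentions routing logic. One possible shape for it, sketched in Python: a lookup table that maps a registry key to a start URL and an instruction template. Only the San Francisco entry is taken from this template; the `Registry` class, the `REGISTRIES` table, the `"sf"` key, and `route_lookup` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Registry:
    """Where to start and what to tell the agent for one registry."""

    start_url: str
    instruction_template: str


# Only the San Francisco entry comes from this template. Entries for other
# cities or states would need their own URLs and instruction wording.
REGISTRIES: Dict[str, Registry] = {
    "sf": Registry(
        start_url="https://data.sfgov.org/stories/s/Registered-Business-Lookup/k6sk-2y6w/",
        instruction_template=(
            'Find and look up the business "{name}" in the SF Business Registry. '
            'Use the DBA Name filter to search for "{name}", apply the filter, '
            "and click on the business row to view detailed information."
        ),
    ),
}


def route_lookup(registry_key: str, business_name: str) -> Dict[str, str]:
    """Resolve a registry key to the URL and instruction for one lookup."""
    try:
        registry = REGISTRIES[registry_key.lower()]
    except KeyError:
        known = ", ".join(sorted(REGISTRIES))
        raise ValueError(f"Unknown registry '{registry_key}'. Known: {known}") from None
    return {
        "url": registry.start_url,
        "instruction": registry.instruction_template.format(name=business_name),
    }
```

The returned URL would go to the page navigation step and the instruction to the agent's execute call, so adding a registry becomes a data change rather than a code change.
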
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,93 @@ | ||
| // Business Lookup with Agent - See README.md for full documentation | ||
| 
     | 
||
| import "dotenv/config"; | ||
| import { Stagehand } from "@browserbasehq/stagehand"; | ||
| import { z } from "zod"; | ||
| 
     | 
||
| // Business search variables | ||
| const businessName = "Jalebi Street"; | ||
| 
     | 
||
| async function main() { | ||
| // Initialize Stagehand with Browserbase for cloud-based browser automation. | ||
| const stagehand = new Stagehand({ | ||
| env: "BROWSERBASE", | ||
| verbose: 1, | ||
| model: "openai/gpt-4.1", | ||
| browserbaseSessionCreateParams: { | ||
| projectId: process.env.BROWSERBASE_PROJECT_ID!, | ||
| 
         Review comment: we don't need to specify projectId if it's already in our env vars. We can keep it here, but we should be consistent and also pass in the API key.  | 
||
| }, | ||
| }); | ||
| 
     | 
||
| try { | ||
| // Initialize browser session to start automation. | ||
| await stagehand.init(); | ||
| console.log("Stagehand initialized successfully!"); | ||
| console.log(`Live View Link: https://browserbase.com/sessions/${stagehand.browserbaseSessionId}`); | ||
| 
     | 
||
| const page = stagehand.context.pages()[0]; | ||
| 
     | 
||
| // Navigate to SF Business Registry search page. | ||
| console.log("Navigating to SF Business Registry..."); | ||
| await page.goto("https://data.sfgov.org/stories/s/Registered-Business-Lookup/k6sk-2y6w/"); | ||
| 
     | 
||
| // Create agent with computer use capabilities for autonomous business search. | ||
| const agent = stagehand.agent({ | ||
| cua: true, // Enable Computer Use Agent mode | ||
| model: { | ||
| modelName: "google/gemini-2.5-computer-use-preview-10-2025", | ||
| apiKey: process.env.GOOGLE_GENERATIVE_AI_API_KEY | ||
| }, | ||
| systemPrompt: "You are a helpful assistant that can use a web browser to search for business information.", | ||
| }); | ||
| 
     | 
||
| console.log(`Searching for business: ${businessName}`); | ||
| const result = await agent.execute({ | ||
| instruction: `Find and look up the business "${businessName}" in the SF Business Registry. Use the DBA Name filter to search for "${businessName}", apply the filter, and click on the business row to view detailed information. Scroll towards the right to see the NAICS code.`, | ||
| maxSteps: 30, | ||
| }); | ||
| 
     | 
||
| if (!result.success) { | ||
| throw new Error("Agent failed to complete the search"); | ||
| } | ||
| 
     | 
||
| // Extract comprehensive business information after agent completes the search. | ||
| console.log("Extracting business information..."); | ||
| const businessInfo = await stagehand.extract( | ||
| "Extract all visible business information including DBA Name, Ownership Name, Business Account Number, Location Id, Street Address, Business Start Date, Business End Date, Neighborhood, NAICS Code, and NAICS Code Description", | ||
| z.object({ | ||
| dbaName: z.string(), | ||
| ownershipName: z.string().optional(), | ||
| businessAccountNumber: z.string(), | ||
| locationId: z.string().optional(), | ||
| streetAddress: z.string().optional(), | ||
| businessStartDate: z.string().optional(), | ||
| businessEndDate: z.string().optional(), | ||
| neighborhood: z.string().optional(), | ||
| naicsCode: z.string(), | ||
| naicsCodeDescription: z.string().optional(), | ||
| }), | ||
| { page }, | ||
| ); | ||
| 
     | 
||
| console.log("Business Information:"); | ||
| console.log(JSON.stringify(businessInfo, null, 2)); | ||
| 
     | 
||
| 
         Review comment: extra space  | 
||
| 
     | 
||
| } catch (error) { | ||
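| console.error("Error during business lookup:", error); | ||
| throw error; // Rethrow so the main().catch handler below runs and the process exits non-zero. | ||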
| } finally { | ||
| // Always close session to release resources and clean up. | ||
| await stagehand.close(); | ||
| console.log("Session closed successfully"); | ||
| } | ||
| } | ||
| 
     | 
||
| main().catch((err) => { | ||
| console.error("Error in business lookup:", err); | ||
| console.error("Common issues:"); | ||
| console.error(" - Check .env file has BROWSERBASE_PROJECT_ID and BROWSERBASE_API_KEY"); | ||
| console.error(" - Verify GOOGLE_GENERATIVE_AI_API_KEY is set for the agent"); | ||
| console.error("Docs: https://docs.browserbase.com/stagehand"); | ||
| process.exit(1); | ||
| }); | ||
| 
     | 
||
Review comment: are we pushing uv?