MysteriousVoid/Industry-Benchmarking-Knowledge-Graph

Intelligent Business Knowledge Graph (IBKG)

A data pipeline and analysis system that processes business metrics and compares them across multiple companies. It helps organizations understand their performance metrics, optimize resource allocation, and make data-driven decisions.

Live Demo

A live version of this dashboard is available at:

👉 View the Live Demo on Render.com

🌟 Features

Core Features

  • Multi-Company Analysis: Process and compare metrics across multiple companies
  • Impact Analysis: Calculate the effect of budget changes on key metrics
  • Historical Trends: Track and analyze metric changes over time
  • Customer Segmentation: Analyze performance across different customer segments
  • Department Optimization: Optimize resource allocation across departments

Advanced Features

  • Real-time Processing: Process data as it comes in
  • Automated Validation: Ensure data quality and consistency
  • Custom Metrics: Define and track company-specific KPIs
  • Export Capabilities: Export data in multiple formats
  • API Integration: Connect with other business systems

📊 Data Structure

Raw Data

Located in src/data/pipeline/raw/, organized by data type, with one file per company in each directory:

raw/
├── financial/          # Monthly financial metrics
│   ├── Company_financial_2024.csv
│   └── ...
├── departments/        # Department budgets and targets
│   ├── Company_budgets_2024.xlsx
│   └── ...
├── historical/         # Historical trends and metrics
│   ├── Company_metrics_2023.csv
│   └── ...
└── customers/          # Customer segmentation and behavior
    ├── Company_customers_2024.json
    └── ...

Data Formats

Financial Data (CSV)

date,revenue,expenses,profit,cac,ltv,churnRate,arpa
2024-01,6500000,4800000,1700000,2500,15000,0.008,1200
2024-02,6700000,4900000,1800000,2450,15200,0.007,1220
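
As a rough illustration of how rows in this format could be loaded and sanity-checked, here is a dependency-free sketch. The project lists csv-parse for real parsing; the helper names parseFinancialCsv and findProfitMismatches are hypothetical, and the rule that profit equals revenue minus expenses is an assumption inferred from the sample rows.

```javascript
// Sample rows copied from the Financial Data example above.
const csvText = `date,revenue,expenses,profit,cac,ltv,churnRate,arpa
2024-01,6500000,4800000,1700000,2500,15000,0.008,1200
2024-02,6700000,4900000,1800000,2450,15200,0.007,1220`;

// Hypothetical helper: turns the CSV text into an array of row objects.
// Every column except `date` is converted to a number.
function parseFinancialCsv(text) {
  const [headerLine, ...dataLines] = text.trim().split(/\r?\n/);
  const headers = headerLine.split(',');
  return dataLines.map((line) => {
    const cells = line.split(',');
    const row = {};
    headers.forEach((name, i) => {
      row[name] = name === 'date' ? cells[i] : Number(cells[i]);
    });
    return row;
  });
}

// Assumed consistency rule: profit should equal revenue minus expenses.
// Returns the dates of rows that break the rule.
function findProfitMismatches(rows) {
  return rows
    .filter((r) => r.revenue - r.expenses !== r.profit)
    .map((r) => r.date);
}

const rows = parseFinancialCsv(csvText);
console.log(rows[0].profit);             // 1700000
console.log(findProfitMismatches(rows)); // []
```

This simple split-based parser does not handle quoted fields or embedded commas, which is one reason to prefer a real CSV library for production data.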

Department Budgets (JSON; the raw file listed above is .xlsx)

{
  "departments": [
    {
      "name": "Sales",
      "budget": 2000000,
      "headcount": 90,
      "target_revenue": 35000000,
      "tools_budget": 250000
    },
    {
      "name": "Marketing",
      "budget": 1500000,
      "headcount": 70,
      "target_leads": 40000,
      "tools_budget": 200000
    }
  ]
}
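
To show what can be derived from this structure, the sketch below computes each department's share of the total budget, budget per head, and the fraction spent on tools. The function name summarizeDepartments and the choice of derived fields are illustrative assumptions, not part of the project's documented API.

```javascript
// Subset of the Department Budgets example above.
const budgets = {
  departments: [
    { name: 'Sales', budget: 2000000, headcount: 90, tools_budget: 250000 },
    { name: 'Marketing', budget: 1500000, headcount: 70, tools_budget: 200000 },
  ],
};

// Hypothetical helper: derives simple ratios for each department.
function summarizeDepartments(data) {
  const total = data.departments.reduce((sum, d) => sum + d.budget, 0);
  return data.departments.map((d) => ({
    name: d.name,
    budgetShare: d.budget / total,         // fraction of the combined budget
    budgetPerHead: d.budget / d.headcount, // budget per team member
    toolsShare: d.tools_budget / d.budget, // fraction of budget spent on tools
  }));
}

const summary = summarizeDepartments(budgets);
console.log(summary[0].toolsShare); // 0.125
```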

Customer Data (JSON)

{
  "customer_segments": {
    "enterprise": {
      "count": 300,
      "avg_contract_value": 800000,
      "churn_rate": 0.04,
      "expansion_rate": 0.35,
      "avg_tenure_months": 42
    },
    "mid_market": {
      "count": 600,
      "avg_contract_value": 200000,
      "churn_rate": 0.07,
      "expansion_rate": 0.28,
      "avg_tenure_months": 30
    }
  },
  "product_adoption": {
    "core_product": 0.95,
    "addon_1": 0.85,
    "addon_2": 0.75
  }
}
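
The segment data above lends itself to simple roll-ups. The following sketch totals customers and contract value across segments and computes a churn rate weighted by customer count. The name summarizeSegments and the weighting scheme are assumptions made for illustration.

```javascript
// Subset of the Customer Data example above.
const customerData = {
  customer_segments: {
    enterprise: { count: 300, avg_contract_value: 800000, churn_rate: 0.04 },
    mid_market: { count: 600, avg_contract_value: 200000, churn_rate: 0.07 },
  },
};

// Hypothetical helper: aggregates all segments into one summary.
function summarizeSegments(segments) {
  let customers = 0;
  let contractValue = 0;
  let expectedChurned = 0;
  for (const seg of Object.values(segments)) {
    customers += seg.count;
    contractValue += seg.count * seg.avg_contract_value;
    expectedChurned += seg.count * seg.churn_rate;
  }
  return {
    customers,
    contractValue,
    // Churn weighted by customer count (roughly 0.06 for the sample data).
    blendedChurn: expectedChurned / customers,
  };
}

const segmentSummary = summarizeSegments(customerData.customer_segments);
console.log(segmentSummary.customers);     // 900
console.log(segmentSummary.contractValue); // 360000000
```

Weighting churn by contract value instead of customer count would give a revenue-oriented figure; which one the pipeline uses is not stated here.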

Processed Data

Located in src/data/companies/, each company has its own directory containing:

companies/
├── Salesforce/
│   └── impactMatrix.json
├── HubSpot/
│   └── impactMatrix.json
├── Workday/
│   └── impactMatrix.json
├── ServiceNow/
│   └── impactMatrix.json
└── Zoom/
    └── impactMatrix.json

🛠️ Setup

Prerequisites

  • Node.js (v18 or higher; Vite 5, listed in the dependencies below, does not support older versions)
  • npm (v6 or higher)
  • Git
  • 4GB RAM minimum
  • 1GB free disk space

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/ibkg.git
cd ibkg
  2. Install dependencies:
npm install
  3. Verify that package.json lists the required dependencies:
{
  "dependencies": {
    "csv-parse": "^5.5.3",
    "xlsx": "^0.18.5",
    "react": "^18.2.0",
    "react-dom": "^18.2.0",
    "vite": "^5.0.0"
  }
}

Environment Setup

  1. Create a .env file:
cp .env.example .env
  2. Configure environment variables:
NODE_ENV=development
PORT=3000
DATA_DIR=src/data
LOG_LEVEL=info
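
For orientation, here is one way these variables might be read in code, falling back to the values shown above when a variable is unset. This is a sketch only: the function name loadConfig and the shape of the returned object are assumptions, and the project may load its configuration differently.

```javascript
// Hypothetical helper: builds a config object from an environment map.
// Defaults mirror the example .env values above.
function loadConfig(env) {
  const port = Number.parseInt(env.PORT ?? '3000', 10);
  if (Number.isNaN(port)) {
    throw new Error(`PORT must be an integer, got "${env.PORT}"`);
  }
  return {
    nodeEnv: env.NODE_ENV ?? 'development',
    port,
    dataDir: env.DATA_DIR ?? 'src/data',
    logLevel: env.LOG_LEVEL ?? 'info',
  };
}

// Example with an explicit map; in an application you would pass process.env.
const config = loadConfig({ PORT: '8080', LOG_LEVEL: 'debug' });
console.log(config.port);    // 8080
console.log(config.dataDir); // src/data
```

Taking the environment as a parameter rather than reading process.env inside the function keeps the helper easy to test.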

🚀 Usage

Running the Data Pipeline

Process all company data:

npm run process-data

Or run the pipeline script directly (see Common Commands below for the --company flag):

node src/data/pipeline/runPipeline.js

Development

Start the development server:

npm start

Build for production:

npm run build

Run tests:

npm test

Common Commands

# Process specific company
npm run process-data -- --company=Salesforce

# Run with debug logging
npm run process-data -- --debug

# Generate reports
npm run generate-reports

# Validate data
npm run validate-data
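
The flags used above (--company=<name> and --debug) could be read from the argument list along these lines. This is a minimal sketch with a hypothetical parseCliArgs helper; how runPipeline.js actually parses its arguments is not shown in this README.

```javascript
// Hypothetical helper: extracts the flags shown above from an argv-style array.
// Unknown arguments are ignored.
function parseCliArgs(argv) {
  const options = { company: null, debug: false };
  for (const arg of argv) {
    if (arg.startsWith('--company=')) {
      options.company = arg.slice('--company='.length);
    } else if (arg === '--debug') {
      options.debug = true;
    }
  }
  return options;
}

// In a script you would pass process.argv.slice(2).
const cliOptions = parseCliArgs(['--company=Salesforce', '--debug']);
console.log(cliOptions); // { company: 'Salesforce', debug: true }
```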

📈 Data Processing

Financial Metrics

  • Revenue: Monthly revenue figures
    • Total revenue
    • Recurring revenue
    • One-time revenue
  • Expenses: Operating costs and investments
    • Fixed costs
    • Variable costs
    • R&D expenses
  • Profit: Net income after expenses
    • Gross profit
    • Operating profit
    • Net profit
  • CAC: Customer Acquisition Cost
    • Marketing CAC
    • Sales CAC
    • Total CAC
  • LTV: Customer Lifetime Value
    • Gross LTV
    • Net LTV
    • LTV/CAC ratio
  • Churn Rate: Customer attrition rate
    • Gross churn
    • Net churn
    • Revenue churn
  • ARPA: Average Revenue Per Account
    • Monthly ARPA
    • Annual ARPA
    • ARPA growth
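
Several of the derived figures listed above follow directly from the CSV columns. The sketch below computes the LTV/CAC ratio, annual ARPA, and month-over-month ARPA growth. It assumes the arpa column is a monthly figure and that annual ARPA is twelve times that value; the function name deriveMetrics is hypothetical.

```javascript
// Hypothetical helper: derives ratios from one month's row and,
// optionally, the previous month's row.
function deriveMetrics(current, previous) {
  return {
    ltvToCac: current.ltv / current.cac,
    annualArpa: current.arpa * 12, // assumes arpa is monthly
    arpaGrowth: previous
      ? (current.arpa - previous.arpa) / previous.arpa
      : null, // no growth figure without a previous month
  };
}

// Values taken from the Financial Data example rows.
const jan = { cac: 2500, ltv: 15000, arpa: 1200 };
const feb = { cac: 2450, ltv: 15200, arpa: 1220 };

console.log(deriveMetrics(jan).ltvToCac);        // 6
console.log(deriveMetrics(jan).annualArpa);      // 14400
console.log(deriveMetrics(feb, jan).arpaGrowth); // roughly 0.0167
```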

Department Metrics

  • Budget Allocations: Department-wise budget distribution
    • Operating budget
    • Capital budget
    • Project budget
  • Headcount: Team size and composition
    • Full-time employees
    • Contractors
    • Team distribution
  • Target Metrics: Department-specific KPIs
    • Sales targets
    • Marketing goals
    • R&D milestones
  • Tool Budgets: Software and tool investments
    • SaaS tools
    • Development tools
    • Analytics tools

Impact Analysis

The system generates impact matrices showing how changes in each department's budget affect key metrics:

  • Marketing Budget:
    • Effect on CAC (typically negative correlation)
    • Impact on revenue growth
    • Brand awareness metrics
    • Lead generation efficiency
  • Sales Budget:
    • Revenue generation efficiency
    • CAC optimization
    • Sales cycle length
    • Win rate improvement
  • R&D Budget:
    • Churn rate reduction
    • LTV improvement
    • Product adoption
    • Feature usage
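
One way to read an impact matrix is as a table of sensitivities: the projected percent change in a metric is the coefficient multiplied by the percent change in budget. That linear interpretation, the nested object shape, and the projectChange helper are all assumptions for illustration; the actual layout of impactMatrix.json is not documented here. The two coefficients are the Zoom values quoted under Current State.

```javascript
// Hypothetical matrix shape: department -> metric -> coefficient.
const zoomImpact = {
  sales: { revenue: 0.41475 },
  marketing: { cac: -0.2925 },
};

// Assumed linear model: metric change (%) = coefficient * budget change (%).
function projectChange(matrix, department, metric, budgetChangePct) {
  const coefficient = matrix[department]?.[metric];
  if (coefficient === undefined) {
    throw new Error(`No coefficient for ${department} -> ${metric}`);
  }
  return coefficient * budgetChangePct;
}

// A 10% increase in the sales budget projects roughly +4.1% revenue,
// and a 10% increase in the marketing budget projects roughly -2.9% CAC.
console.log(projectChange(zoomImpact, 'sales', 'revenue', 10));
console.log(projectChange(zoomImpact, 'marketing', 'cac', 10));
```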

📊 Current State

The system successfully processes data for all supported companies, generating impact matrices that reflect:

  1. Salesforce

    • Strong sales impact on revenue (0.41125)
    • Balanced marketing and R&D impacts
    • Enterprise-focused metrics
    • High customer retention
    • Strong upsell potential
  2. HubSpot

    • High marketing efficiency
    • Strong customer retention metrics
    • Mid-market optimization
    • Content marketing focus
    • High organic growth
  3. Workday

    • Enterprise-focused metrics
    • High LTV impact from R&D (0.32)
    • Strong HR/Finance focus
    • High implementation success
    • Strong partner ecosystem
  4. ServiceNow

    • Highest R&D impact on LTV (0.34)
    • Strong enterprise retention
    • IT service management focus
    • High platform adoption
    • Strong integration capabilities
  5. Zoom

    • High marketing impact on CAC reduction (-0.2925)
    • Strong sales revenue impact (0.41475)
    • High user engagement metrics
    • Strong freemium model
    • High viral growth
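
The coefficients quoted above can be compared across companies with a simple ranking. This sketch includes only the values stated in this list (other companies' coefficients are not given here), and the rankByImpact helper is hypothetical.

```javascript
// Coefficients quoted in the list above.
const rdImpactOnLtv = { Workday: 0.32, ServiceNow: 0.34 };
const salesImpactOnRevenue = { Salesforce: 0.41125, Zoom: 0.41475 };

// Hypothetical helper: sorts companies by the magnitude of a coefficient,
// largest first.
function rankByImpact(coefficients) {
  return Object.entries(coefficients)
    .sort(([, a], [, b]) => Math.abs(b) - Math.abs(a))
    .map(([company, value]) => ({ company, value }));
}

console.log(rankByImpact(rdImpactOnLtv)[0].company);        // ServiceNow
console.log(rankByImpact(salesImpactOnRevenue)[0].company); // Zoom
```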

🔧 Technical Details

Pipeline Architecture

src/data/pipeline/
├── processData.js      # Main processing logic
├── runPipeline.js      # Pipeline execution
├── utils/              # Helper functions
│   ├── validation.js   # Data validation
│   ├── metrics.js      # Metric calculations
│   └── export.js       # Data export
├── models/             # Data models
│   ├── Company.js      # Company model
│   ├── Metrics.js      # Metrics model
│   └── Impact.js       # Impact model
├── processed/          # Intermediate results
└── raw/                # Input data

Processing Steps

  1. Data Validation

    • Format checking
    • Type validation
    • Range validation
    • Consistency checks
  2. Metric Calculation

    • Financial metrics
    • Operational metrics
    • Customer metrics
    • Department metrics
  3. Impact Analysis

    • Budget impact
    • Resource allocation
    • Performance metrics
    • Trend analysis
  4. Matrix Generation

    • Impact matrices
    • Correlation matrices
    • Trend matrices
    • Performance matrices
  5. Result Storage

    • JSON output
    • CSV export
    • Database storage
    • Cache management
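
The staged flow above can be pictured as a list of named steps, each receiving the previous step's output. The sketch below uses three simplified stand-in steps rather than the five real ones; the step names, their logic, and the runPipeline signature are assumptions and do not describe the contents of runPipeline.js.

```javascript
// Simplified stand-in steps (hypothetical).
const steps = [
  {
    name: 'validate',
    // Keep only rows whose revenue and expenses are finite numbers.
    run: (rows) =>
      rows.filter((r) => Number.isFinite(r.revenue) && Number.isFinite(r.expenses)),
  },
  {
    name: 'calculateMetrics',
    // Add a profit margin to each row.
    run: (rows) =>
      rows.map((r) => ({ ...r, margin: (r.revenue - r.expenses) / r.revenue })),
  },
  {
    name: 'serialize',
    // Produce JSON output.
    run: (rows) => JSON.stringify(rows),
  },
];

// Runs the steps in order and reports which step failed, if any.
function runPipeline(input, pipelineSteps) {
  let data = input;
  for (const step of pipelineSteps) {
    try {
      data = step.run(data);
    } catch (err) {
      throw new Error(`Step "${step.name}" failed: ${err.message}`);
    }
  }
  return data;
}

const output = runPipeline(
  [{ revenue: 100, expenses: 60 }, { revenue: 'n/a', expenses: 10 }],
  steps
);
console.log(output); // JSON string containing only the valid row
```

Wrapping each step in its own try/catch keeps error messages tied to a stage name, which helps when a later stage fails on data produced by an earlier one.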

Error Handling

  • Input Validation

    • File existence
    • Format validation
    • Data type checking
    • Range validation
  • Processing Errors

    • Calculation errors
    • Memory issues
    • Timeout handling
    • Resource limits
  • Output Validation

    • Result verification
    • Format checking
    • Consistency validation
    • Export validation
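
As an example of the type and range checks listed under Input Validation, the sketch below validates one financial row and returns a list of problems instead of throwing on the first one. The specific rules (YYYY-MM dates, non-negative amounts, churn rate between 0 and 1) and the name validateFinancialRow are assumptions based on the sample data, not the contents of validation.js.

```javascript
// Hypothetical validator: returns an array of error messages (empty if valid).
function validateFinancialRow(row) {
  const errors = [];

  // Format check: dates in the sample data look like 2024-01.
  if (!/^\d{4}-\d{2}$/.test(String(row.date))) {
    errors.push('date must look like YYYY-MM');
  }

  // Type and range checks for monetary fields.
  for (const field of ['revenue', 'expenses', 'cac', 'ltv', 'arpa']) {
    const value = row[field];
    if (typeof value !== 'number' || !Number.isFinite(value)) {
      errors.push(`${field} must be a finite number`);
    } else if (value < 0) {
      errors.push(`${field} must not be negative`);
    }
  }

  // Range check: churn rate is a fraction. Written so that NaN also fails.
  if (!(row.churnRate >= 0 && row.churnRate <= 1)) {
    errors.push('churnRate must be between 0 and 1');
  }

  return errors;
}

const goodRow = {
  date: '2024-01', revenue: 6500000, expenses: 4800000,
  cac: 2500, ltv: 15000, churnRate: 0.008, arpa: 1200,
};
console.log(validateFinancialRow(goodRow)); // []
console.log(validateFinancialRow({ ...goodRow, churnRate: 1.5, cac: -1 }));
```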

Performance Optimization

  • Parallel processing
  • Caching strategies
  • Memory management
  • Resource optimization
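
To make the caching item concrete, here is a minimal memoization sketch: results are stored in a Map keyed by company name so repeated requests skip the expensive work. The memoize helper and the stand-in loader are hypothetical; the project's actual caching strategy is not described here.

```javascript
// Hypothetical helper: caches the result of fn for each distinct key.
function memoize(fn) {
  const cache = new Map();
  return (key) => {
    if (!cache.has(key)) {
      cache.set(key, fn(key));
    }
    return cache.get(key);
  };
}

// Stand-in for an expensive load; counts how often it really runs.
let loadCount = 0;
const loadCompany = memoize((name) => {
  loadCount += 1;
  return { name };
});

loadCompany('Zoom');
loadCompany('Zoom'); // served from the cache
loadCompany('HubSpot');
console.log(loadCount); // 2
```

A cache like this never expires entries, so it suits batch runs better than long-lived servers where the underlying files may change.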

🐛 Troubleshooting

Common Issues

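  1. Data Processing Errors
Error: Invalid data format
Solution: Check that the file matches the format and structure shown under Data Formats
  2. Memory Issues
Error: Out of memory
Solution: Increase the Node.js heap limit, for example NODE_OPTIONS=--max-old-space-size=4096 npm run process-data
  3. File Access Issues
Error: Cannot read file
Solution: Check that the file exists and that its permissions allow reading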

Debug Mode

# Enable debug logging
DEBUG=* npm run process-data

# Show detailed errors
npm run process-data -- --verbose

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a Pull Request

Development Guidelines

  • Follow the existing code style
  • Add tests for new features
  • Update documentation
  • Ensure backward compatibility

Code Style

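// Example of code style
function processData(data) {
  // Input validation: data.map below requires an array
  if (!Array.isArray(data)) {
    throw new Error('Invalid input: expected an array');
  }

  // Process data: return new objects instead of mutating the input items
  const result = data.map(item => ({
    ...item,
    processed: true
  }));

  return result;
}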

Testing

# Run all tests
npm test

# Run specific test
npm test -- --grep="processData"

# Coverage report
npm run test:coverage

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Inspired by the need for better financial benchmarking tools
  • Built with modern web technologies
  • Designed for CFOs and financial analysts
  • Community contributions and feedback

Screenshots

See below for examples of the dashboard in action. These screenshots illustrate the core features and user experience of the IBKG platform.

AI Insights and Knowledge Graph

The left panel provides AI-powered insights, including trend predictions, anomaly detection, and actionable recommendations. The bottom-left bar chart compares the selected metric across industry peers, color-coded by performance. The right panel features an interactive knowledge graph, visualizing real relationships between departments (red) and business metrics (green) for the selected company. Users can drag nodes to explore dependencies and influences.

What-If Scenario

The What-If Scenario panel allows users to simulate the impact of changing a department's budget on key business metrics. Users can adjust the budget percentage and instantly see projected changes in revenue, costs, and other KPIs. Each metric is shown as a delta card, clearly displaying the current value, what-if value, and the percent change, color-coded for intuitive interpretation.
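
The delta cards described above could be computed along the following lines. This is a sketch under stated assumptions: the buildDeltaCard name, the color names, and the rule that lower values are better for CAC and churn rate are illustrative and not taken from the dashboard's source.

```javascript
// Assumed: for these metrics a decrease counts as an improvement.
const LOWER_IS_BETTER = new Set(['cac', 'churnRate']);

// Hypothetical helper: builds the data behind one delta card.
function buildDeltaCard(metric, current, whatIf) {
  const percentChange = ((whatIf - current) / current) * 100;
  const improved = LOWER_IS_BETTER.has(metric)
    ? percentChange < 0
    : percentChange > 0;
  let color = 'gray'; // no change
  if (percentChange !== 0) {
    color = improved ? 'green' : 'red';
  }
  return { metric, current, whatIf, percentChange, color };
}

// Revenue rising and CAC falling are both shown as improvements.
console.log(buildDeltaCard('revenue', 6500000, 6760000).color); // green
console.log(buildDeltaCard('cac', 2500, 2400).color);           // green
console.log(buildDeltaCard('cac', 2500, 2600).color);           // red
```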
