    Repositories list

    • vivaria

      Public
      Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
      TypeScript
      MIT License
      287822415 Updated Feb 6, 2025
    • Shell
      1070 Updated Feb 6, 2025
    • Public repository containing METR's DVC pipeline for eval data analysis
      Python
      3056 Updated Feb 5, 2025
    • Inspect: A framework for large language model evaluations
      Python
      MIT License
      179000 Updated Feb 4, 2025
    • METR Task Standard
      TypeScript
      MIT License
      3214163 Updated Feb 3, 2025
    • Python
      Other
      65832 Updated Jan 30, 2025
    • Fork of the eval-analysis-public repo intended for agent PRs
      Python
      3000 Updated Jan 28, 2025
    • LLM training code for Databricks foundation models
      Python
      Apache License 2.0
      542000 Updated Jan 28, 2025
    • A Cookiecutter template for developing tasks according to the METR Task Standard
      TypeScript
      0100 Updated Jan 22, 2025
    • Python
      1032 Updated Jan 21, 2025
    • Python
      0000 Updated Jan 21, 2025
    • [ICLR 2024] SWE-bench: Can Language Models Resolve Real-world Github Issues?
      Python
      MIT License
      404001 Updated Jan 15, 2025
    • A Kubernetes sandbox environment for use with inspect_ai
      Python
      MIT License
      3000 Updated Jan 14, 2025
    • TeX
      Other
      78302 Updated Jan 9, 2025
    • Dockerfile
      0000 Updated Dec 29, 2024
    • Python
      0010 Updated Dec 28, 2024
    • .github

      Public
      0000 Updated Nov 24, 2024
    • nanoGPT

      Public
      The simplest, fastest repository for training/finetuning medium-sized GPTs.
      Python
      MIT License
      6.4k000 Updated Nov 22, 2024
    • SCSS
      MIT License
      4302 Updated Nov 21, 2024
    • Python
      0000 Updated Nov 8, 2024
    • Python
      1000 Updated Nov 2, 2024
    • pyhooks

      Public archive
      A library that METR agents use to communicate with Vivaria.
      Python
      1010 Updated Sep 22, 2024
    • vivaria-mentat

      Public archive
      Vivaria is METR's tool for running evaluations and conducting agent elicitation research.
      TypeScript
      MIT License
      28011 Updated Sep 19, 2024
    • task-template

      Public template
      TypeScript
      6923 Updated Aug 6, 2024