ETM is a new metric for the Text-to-SQL task. It measures semantic accuracy with a lower false-positive rate than Execution Accuracy and a lower false-negative rate than Exact Set Matching. It is released along with the outputs of several state-of-the-art models. This repo contains all the code necessary for evaluation.
treeMatch.py is written in Python 3.12.
To run this evaluation, you need gold and predicted txt files. Examples are linked in spider_dev, spider_test, cosql_dev, and bird_dev. Each of these folders contains:
gold.txt
: gold file where each line is gold SQL \t db_id
C3.txt
: C3 model predictions
DAIL.txt
: DAIL model predictions
DIN.txt
: DIN model predictions
RASAT+PICARD.txt
: RASAT+PICARD predictions
RESDSQL.txt
: RESDSQL predictions
SuperSQL.txt
: SuperSQL predictions
codeS.txt
: CodeS-7b predictions
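The gold file format described above can be read with a few lines of Python. This is a minimal sketch rather than code from this repo: the function names `parse_gold_line` and `read_gold` are hypothetical, and it assumes UTF-8 files in which the db_id follows the last tab on each line.

```python
from pathlib import Path


def parse_gold_line(line: str) -> tuple[str, str]:
    """Split one gold line of the form 'gold SQL<TAB>db_id'."""
    # Split on the last tab, so a stray tab inside the SQL does not break parsing.
    sql, db_id = line.rstrip("\n").rsplit("\t", 1)
    return sql.strip(), db_id.strip()


def read_gold(path: str) -> list[tuple[str, str]]:
    """Read a gold file into (sql, db_id) pairs.

    Assumption: blank lines carry no data and can be skipped.
    """
    text = Path(path).read_text(encoding="utf-8")
    return [parse_gold_line(line) for line in text.splitlines() if line.strip()]
```

A line without a tab raises a `ValueError` when unpacking, which is a quick way to spot a malformed gold file before starting an evaluation run.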
First, download the database folders for spider (dev and test), cosql (dev only), and/or bird (dev). Save each one in its respective database directory (spider_dev/database/, spider_test/database/, cosql_dev/database/, bird_dev/database/).
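Before running an evaluation, it can help to confirm that every db_id referenced in a gold file has a database on disk. The sketch below is an illustration only: `missing_databases` is a hypothetical helper, and the layout it checks (`<db_root>/<db_id>/<db_id>.sqlite`, as in the Spider release) is an assumption that may not match how your download is organized.

```python
from pathlib import Path


def missing_databases(db_ids: list[str], db_root: str) -> list[str]:
    """Return the db_ids that have no SQLite file under db_root.

    Assumption: Spider-style layout, db_root/<db_id>/<db_id>.sqlite.
    """
    root = Path(db_root)
    missing = []
    # De-duplicate and sort so the report is stable across runs.
    for db_id in sorted(set(db_ids)):
        if not (root / db_id / f"{db_id}.sqlite").is_file():
            missing.append(db_id)
    return missing
```

An empty list means every referenced database was found under the assumed layout.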
Then, create a conda environment:
conda create -n "ETM" python=3.12.3
conda activate ETM
Install packages:
pip install -r requirements.txt
To run our script, use the following command:
python3 treeMatch.py --gold path/to/gold.txt --pred path/to/pred.txt --db path/to/database/folder/ --table path/to/tables.json
--gold
: gold txt file.
--pred
: predictions txt file.
--db
: directory of databases.
--table
: tables json file.
--etype
: Evaluation type (exe, treematch, or all). Default is all.
--verbose
: optional flag. Add it to print extra information, such as which rules are applied in each comparison.
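When evaluating several prediction files, it may be convenient to assemble the command programmatically. The following sketch uses only the flags listed above; `build_command` is a hypothetical helper, not part of this repo, and it assumes the script is launched from the repo root with the current Python interpreter.

```python
import sys


def build_command(gold, pred, db, table, etype="all", verbose=False):
    """Assemble the treeMatch.py command line from the documented flags."""
    # The README lists exactly these three evaluation types.
    if etype not in ("exe", "treematch", "all"):
        raise ValueError(f"unknown etype: {etype}")
    cmd = [
        sys.executable, "treeMatch.py",
        "--gold", gold,
        "--pred", pred,
        "--db", db,
        "--table", table,
        "--etype", etype,
    ]
    if verbose:
        # --verbose is a bare flag with no value.
        cmd.append("--verbose")
    return cmd
```

Passing the result to `subprocess.run(cmd, check=True)` would start the evaluation. The paths themselves, in particular the location of the tables json file, depend on where you saved the data.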