
Text-to-scQL Dataset

This project generates a dataset of English sentences and their corresponding scQL queries.

To generate the dataset with the default settings, run:

./generate.py

For each example, it outputs three lines:

  1. a sentence in English
  2. a scQL query
  3. a JSON representation of the expected outcome of the query
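
The three-line layout above can be read back into records with a few lines of Python. This is only a sketch, not part of the project: the names `Example` and `read_examples` are hypothetical, and it assumes that each of the three items fits on a single line and that blank lines carry no meaning.

```python
import json
from typing import Iterable, Iterator, NamedTuple


class Example(NamedTuple):
    sentence: str     # line 1: the English sentence
    query: str        # line 2: the scQL query
    expected: object  # line 3: the decoded JSON outcome


def read_examples(lines: Iterable[str]) -> Iterator[Example]:
    """Group consecutive lines into (sentence, query, expected) triples."""
    # Assumption: blank lines are not part of the format, so they are skipped.
    cleaned = [line.rstrip("\n") for line in lines if line.strip()]
    if len(cleaned) % 3 != 0:
        raise ValueError(f"expected a multiple of 3 lines, got {len(cleaned)}")
    for i in range(0, len(cleaned), 3):
        sentence, query, raw = cleaned[i:i + 3]
        yield Example(sentence, query, json.loads(raw))
```

If the script's output has been saved to a file, something like `list(read_examples(open("dataset.txt", encoding="utf-8")))` would load it; the file name here is just an example.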

You need a credentials.json file of the form:

{ "uri": "uri to scql analyser", "username": "username", "password": "password" }
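
Before running the script, it can be handy to check that the file has the three keys shown above. The following is a minimal sketch under assumptions: `load_credentials` is a hypothetical helper, not a function from this project, and the default path assumes the file sits in the current working directory.

```python
import json
from pathlib import Path

# The three keys shown in the credentials.json example.
REQUIRED_KEYS = ("uri", "username", "password")


def load_credentials(path="credentials.json"):
    """Read the credentials file and check that the documented keys exist."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError(f"{path}: expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"{path}: missing keys: {', '.join(missing)}")
    return data
```

A malformed file raises `json.JSONDecodeError`, and a file lacking one of the keys raises `ValueError` naming the missing entries.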

The script can also be used with the following arguments:

  • --validate: validate the generated queries (for instance, check them for errors).
  • --localhost: use an analyser server running locally.
  • --no-model: generate training data without a model.
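
For readers unfamiliar with how such switches are usually wired up in Python, here is one way the three flags could be declared with `argparse`. It is an illustrative sketch only: it assumes the flags are simple on/off switches that can be combined, and `build_parser` is a hypothetical name. It is not taken from generate.py.

```python
import argparse


def build_parser():
    """Declare the three documented flags as boolean switches (an assumption)."""
    parser = argparse.ArgumentParser(description="Generate a Text-to-scQL dataset.")
    parser.add_argument("--validate", action="store_true",
                        help="validate the generated queries")
    parser.add_argument("--localhost", action="store_true",
                        help="use an analyser server running locally")
    # argparse exposes --no-model as the attribute `no_model`.
    parser.add_argument("--no-model", action="store_true",
                        help="generate training data without a model")
    return parser
```

With this declaration, `build_parser().parse_args(["--validate", "--no-model"])` yields a namespace where `validate` and `no_model` are true and `localhost` is false.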