Agent Skill: autorag-query
AutoRAG-Research ships with an agent skill that lets AI coding agents query pipeline results and metrics using natural language.
The skill follows the Vercel skills standard and works with Claude Code, Codex, Kiro, Cursor, and other compatible agents.
Installation
The skill is bundled at .agents/skills/autorag-query/ in the repository and is auto-detected by agents when you work inside the project.
To install globally (available across all projects):
npx skills add NomaDamas/AutoRAG-Research --skill autorag-query
How It Works
When you ask a data question, the agent:
- Reads the bundled database schema (references/schema.sql)
- Generates a SELECT-only SQL query
- Executes it via scripts/query_executor.py
- Returns formatted results (table / JSON / CSV)
Example:
You: "Which pipeline has the best BLEU score?"
Agent reads the schema, generates SQL, runs it, and replies: "hybrid_search_v2 achieved the highest BLEU score of 0.85."
What You Can Ask
- "Show me all pipelines and their types"
- "Which retrieval pipeline has the best recall?"
- "Compare token usage across generation pipelines"
- "What are the 5 worst-performing queries for BLEU?"
- "Show retrieval scores for query #42"
Query Executor Script
The skill includes a standalone script you can also run directly.
Basic Usage
uv run python .agents/skills/autorag-query/scripts/query_executor.py \
--query "SELECT name, pipeline_type FROM pipeline LIMIT 5" \
--config-path configs
Parameterized Queries
Use :param_name placeholders with --params for safe value substitution:
uv run python .agents/skills/autorag-query/scripts/query_executor.py \
--query "SELECT p.name, s.metric_result FROM summary s JOIN pipeline p ON s.pipeline_id = p.id JOIN metric m ON s.metric_id = m.id WHERE m.name = :metric_name ORDER BY s.metric_result DESC LIMIT 3" \
--config-path configs \
--params '{"metric_name": "rouge"}'
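To illustrate why bound parameters are safer than string interpolation, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the PostgreSQL database. The in-memory table and the find_metric helper are hypothetical, and how query_executor.py binds values internally is not shown here; only the :metric_name placeholder style comes from the example above.

```python
import sqlite3

# In-memory stand-in for the real database; the table contents are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metric (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO metric (name) VALUES (?)", [("rouge",), ("bleu",)])


def find_metric(conn, metric_name):
    """Look up metrics by name using a bound :metric_name parameter (hypothetical helper)."""
    sql = "SELECT name FROM metric WHERE name = :metric_name"
    return [row[0] for row in conn.execute(sql, {"metric_name": metric_name})]


print(find_metric(conn, "rouge"))             # ['rouge']
# A hostile value is compared as a literal string, never parsed as SQL.
print(find_metric(conn, "rouge' OR '1'='1"))  # []
```

Because the value travels separately from the SQL text, quoting tricks inside the value cannot change the shape of the query.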
Output Formats
# JSON output
uv run python .agents/skills/autorag-query/scripts/query_executor.py \
--query "SELECT name, metric_type FROM metric" \
--config-path configs \
--format json
# CSV output
uv run python .agents/skills/autorag-query/scripts/query_executor.py \
--query "SELECT name FROM pipeline" \
--config-path configs \
--format csv
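As a rough picture of what the json and csv formats amount to, the sketch below renders the same rows both ways using only the standard library. The render function and the sample row are hypothetical; the script's exact output layout (indentation, quoting, header handling) may differ.

```python
import csv
import io
import json


def render(columns, rows, fmt="json"):
    """Render result rows as JSON or CSV text (hypothetical helper, not the script's code)."""
    if fmt == "json":
        # One object per row, keyed by column name.
        return json.dumps([dict(zip(columns, row)) for row in rows], indent=2)
    if fmt == "csv":
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(columns)  # header row
        writer.writerows(rows)
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


# Sample data for illustration only.
columns = ["name", "metric_type"]
rows = [("bleu", "generation")]
print(render(columns, rows, "json"))
print(render(columns, rows, "csv"))
```

JSON is usually the easier format to pipe into other tools, while CSV opens directly in a spreadsheet.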
Options
| Flag | Description | Default |
|---|---|---|
| --query, -q | SQL query (SELECT only, required) | - |
| --format, -f | Output format: table, json, csv | table |
| --config-path, -c | Path to configs/ directory containing db.yaml | env vars fallback |
| --params, -p | JSON parameters for :param placeholders | - |
| --timeout, -t | Query timeout in seconds | 10 |
| --limit, -l | Max rows returned (0 = unlimited) | 10000 |
| --database, -d | Database name override | from config |
Connection
The script auto-detects the database connection:
- Config file (if --config-path is provided): Reads db.yaml from the specified directory
- Environment variables (fallback): Uses POSTGRES_HOST, POSTGRES_PORT, POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB
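The resolution order above can be sketched as follows. This is an illustration under assumptions, not the script's implementation: resolve_connection and the setting names (host, port, and so on) are hypothetical, and the config argument stands for an already-parsed db.yaml whose actual keys are not documented here. Only the environment variable names come from the list above.

```python
import os

# Mapping from hypothetical setting names to the documented environment variables.
ENV_KEYS = {
    "host": "POSTGRES_HOST",
    "port": "POSTGRES_PORT",
    "user": "POSTGRES_USER",
    "password": "POSTGRES_PASSWORD",
    "database": "POSTGRES_DB",
}


def resolve_connection(config=None, env=None):
    """Prefer settings from a parsed config; otherwise fall back to environment variables."""
    env = os.environ if env is None else env
    if config is not None:
        # A config file was supplied, so it takes precedence.
        return dict(config)
    # Missing variables come back as None so the caller can report them.
    return {key: env.get(var) for key, var in ENV_KEYS.items()}
```

Passing env explicitly, as in resolve_connection(env={"POSTGRES_HOST": "localhost"}), keeps the function easy to test without touching the real environment.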
Safety
- Only SELECT statements are allowed (DDL/DML keywords are rejected)
- Dangerous PostgreSQL functions are blocked (pg_read_file, pg_execute, COPY, etc.)
- Results are capped at 10,000 rows by default (enforced via subquery wrapper)
- Query timeout defaults to 10 seconds
- Engine connections are disposed after each execution
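A minimal sketch of how SELECT-only validation and a subquery row cap could look. The function names, the keyword list, and the limited_query alias are assumptions for illustration; the script's real checks may be stricter or structured differently.

```python
import re

# Illustrative blocklist; the names from the list above plus common DDL/DML keywords.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|revoke|"
    r"copy|pg_read_file|pg_execute)\b",
    re.IGNORECASE,
)


def validate_select(sql):
    """Return the cleaned statement, or raise ValueError if it is not a plain SELECT."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("multiple statements are not allowed")
    if not re.match(r"select\b", statement, re.IGNORECASE):
        raise ValueError("only SELECT statements are allowed")
    match = FORBIDDEN.search(statement)
    if match:
        raise ValueError(f"forbidden keyword: {match.group(0)}")
    return statement


def wrap_with_limit(sql, limit=10000):
    """Cap rows by wrapping the query in a subquery; 0 means unlimited."""
    if limit <= 0:
        return sql
    return f"SELECT * FROM ({sql}) AS limited_query LIMIT {int(limit)}"


print(wrap_with_limit(validate_select("SELECT name FROM pipeline;")))
# SELECT * FROM (SELECT name FROM pipeline) AS limited_query LIMIT 10000
```

A keyword blocklist like this is deliberately conservative: it also rejects harmless queries that contain a blocked word inside a string literal, which is one reason to pass such values through --params instead.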
Query Templates
The skill bundles 20+ query templates in references/common-queries.md, organized by use case:
- Pipeline comparison: Top pipelines by metric, multi-metric pivot tables
- Per-query analysis: Score breakdowns, ground truth comparison, worst-performing queries
- Retrieval results: Retrieved chunks with scores, recall calculation
- Token usage: Per-pipeline totals, most expensive queries, usage over time
- Execution performance: Slowest queries, average execution time by pipeline
- JSONB extraction: token_usage, config, and result_metadata patterns
Skill Directory Structure
.agents/skills/autorag-query/
├── SKILL.md # Skill definition (auto-detected by agents)
├── references/
│ ├── schema.sql # Complete database schema
│ └── common-queries.md # 20+ curated query templates
└── scripts/
└── query_executor.py # Safe SQL execution script