synq: auto-detect sqlite3 shell scripts so .read is not a syntax error #268

Draft
LalitMaganti wants to merge 3 commits into main from lsp-shell-commands

@LalitMaganti LalitMaganti commented May 30, 2026

Closes #88.

The VSCode extension / LSP / CLI flagged sqlite3 CLI shell scripts —
files using `.read foo.sql` dot-commands, column-0 `#` comments, and
`GO`/`/` terminators — with a spurious "syntax error near '.'" (issue
#88). Those constructs belong to the sqlite3 shell language, a layer
above the SQL library language, so the SQL parser correctly rejects
them; pragmatically we must handle them because such scripts are
ubiquitous.

Treat the shell language as a separate language, auto-detected per file
from its content, and reuse the existing embedded-SQL machinery
(EmbeddedFragment / EmbeddedAnalyzer / OffsetMap) with the roles flipped:
find non-SQL shell lines *around* SQL instead of SQL *inside* a host
language. Shell fragments are contiguous verbatim slices with no holes,
so offsets map back to host coordinates via a pure base offset.
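
A minimal sketch of the "pure base offset" mapping described above. `SqlSlice` and `to_host_offset` are hypothetical names for illustration only; the real `EmbeddedFragment` and `OffsetMap` types are not shown in this description and may look different.

```rust
/// Hypothetical stand-in for one SQL fragment cut out of a shell script.
/// It is a verbatim, contiguous slice of the host file.
#[derive(Debug, Clone, PartialEq)]
struct SqlSlice<'a> {
    /// Byte offset of the slice within the host file.
    base: usize,
    /// The slice text, copied verbatim from the host file.
    text: &'a str,
}

/// Map a byte offset inside the fragment back to the host file.
/// Because the slice has no holes, the mapping is a single addition.
/// Returns `None` for offsets that fall outside the fragment.
fn to_host_offset(slice: &SqlSlice<'_>, fragment_offset: usize) -> Option<usize> {
    if fragment_offset <= slice.text.len() {
        Some(slice.base + fragment_offset)
    } else {
        None
    }
}
```

A diagnostic reported at fragment offset 0 for a slice that starts after a `.read foo.sql` line would then land on the first SQL byte in the host file, not on the dot-command.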

- embedded/shell.rs: new `extract_shell` + `is_shell_script`. A line is
  a dot-command (leading whitespace tolerated, only outside a pending
  statement), a column-0 `#` comment, a lone `GO`/`/` terminator, blank,
  or SQL. The first dot-command or column-0 `#` switches the file into
  shell mode; `GO`/`/` are terminators only in shell mode, so a stray
  `GO` in pure SQL stays a parse error. Handles CRLF line endings.
- lsp/host.rs: `ensure_analysis` auto-detects shell scripts and routes
  the SQL fragments through the embedded analyzer, mapping diagnostics
  back to host offsets so shell lines never reach the SQL parser.
- cli (analyze.rs / cli.rs): add a `Shell` host language and auto-detect
  shell scripts when no `--experimental-lang` is given.
- Tests at three layers: shell extractor units (incl. CRLF + offset
  invariants), EmbeddedAnalyzer integration, and the LSP-host #88
  regression plus a stray-`GO` guard.
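
The shell-mode gating of terminators can be sketched as follows. `is_terminator` is a hypothetical helper, not the function in `embedded/shell.rs`; trimming surrounding whitespace (including a trailing `\r` from CRLF input) and matching `GO` case-insensitively are assumptions of this sketch.

```rust
/// Returns true when `line` is a lone `GO` or `/` terminator.
/// Terminators are only recognized once the file is known to be a
/// shell script, so a stray `GO` in pure SQL is left for the SQL
/// parser to reject.
fn is_terminator(line: &str, shell_mode: bool) -> bool {
    if !shell_mode {
        return false;
    }
    // Assumption: surrounding whitespace (and a CRLF `\r`) is ignored.
    let trimmed = line.trim();
    trimmed == "/" || trimmed.eq_ignore_ascii_case("go")
}
```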

Semantic tokens / hover / go-to-definition for shell files are a
follow-up (TODO); this fixes the spurious parse error only.
@LalitMaganti LalitMaganti marked this pull request as draft May 30, 2026 16:00
github-actions Bot commented May 30, 2026

Playwright Reports

| commit | time | link |
|---|---|---|
| c0ebc67 | 2026-05-30 16:04:09 UTC | report |
| 0387e26 | 2026-05-30 16:41:17 UTC | report |
| abc45cd | 2026-05-31 02:31:02 UTC | report |

…ests

Harden the sqlite3 shell-language support against the upstream CLI
documentation (https://sqlite.org/cli.html). The docs are explicit that
dot-commands must begin with "." at the left margin with no preceding
whitespace, so an indented `.read` is NOT a dot-command — it reaches the
SQL core and is a syntax error. The previous heuristic tolerated leading
whitespace; this aligns it with the documented behavior.

- embedded/shell.rs: classify_line now requires the dot at column 0
  (`raw.as_bytes().first() == Some(&b'.')`, mirroring the `#` rule),
  still gated on not being mid-statement. Updated doc comments and the
  lenient unit tests (indented `.read` is now SQL; indented `.read` no
  longer triggers shell-mode detection).
- embedded/mod.rs: flip the now-misnamed indented-dot detection test.
- tests/lsp_diff_tests/shell.py: new LSP diagnostics diff suite (13
  cases) exercising every documented input-parsing rule end to end
  through the real `syntaqlite lsp` server.
- integration_tests/suites/analyze.py: 24 CLI `analyze` tests covering
  the same rules via auto-detection, exit codes, and error mapping.
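
A rough sketch of the column-0 rule, built around the `raw.as_bytes().first() == Some(&b'.')` check quoted above. The `LineKind` enum and this function signature are assumptions made for illustration, as is gating the `#` comment on not being mid-statement; the actual `classify_line` may differ.

```rust
/// Hypothetical line categories for a sqlite3 shell script.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum LineKind {
    DotCommand,
    Comment,
    Blank,
    Sql,
}

/// Classify one raw line (without its trailing `\n`).
/// `mid_statement` is true while a SQL statement is still pending.
fn classify_line(raw: &str, mid_statement: bool) -> LineKind {
    let first = raw.as_bytes().first();
    // The dot must sit at column 0: an indented `.read` is plain SQL.
    if !mid_statement && first == Some(&b'.') {
        return LineKind::DotCommand;
    }
    // Assumption: the column-0 `#` rule is gated the same way.
    if !mid_statement && first == Some(&b'#') {
        return LineKind::Comment;
    }
    // Whitespace-only lines, including a lone `\r` from CRLF input.
    if raw.trim().is_empty() {
        return LineKind::Blank;
    }
    LineKind::Sql
}
```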

Coverage spans every documented rule: dot-command column-0 requirement
(positive + negative), single-line and mid-statement (continuation)
rules, no-comment-in-dot-command, trailing-bare-semicolon, `GO`/`/`
terminators (case-insensitive, surrounding whitespace, shell-mode-only),
`;` statement separation, column-0 `#` comments (vs indented `#`), plus
consequences: correct line/col mapping, semantic diagnostics still flow,
CRLF handling, dot-only files, blank lines, and explicit
`--experimental-lang shell`.

sqlite3 shell scripts commonly end a SQL statement with a semicolon followed by an inline comment before the next dot-command. The shell extractor must recognize that as a complete statement so the dot-command stays shell syntax instead of being parsed as SQL.

- Track the last significant SQL byte outside strings and comments for shell pending-state detection.
- Add extractor and LSP diff regressions for semicolon-before-comment lines followed by dot-commands.
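
One way to picture the pending-state check is the sketch below. `statement_pending` is a hypothetical name, not the extractor's actual function. The sketch only handles `'` and `"` quoting plus `--` and `/* */` comments, and it assumes an unterminated string or block comment counts as pending.

```rust
/// Returns true when `sql` ends in the middle of a statement, judged by
/// the last significant byte outside strings and comments.
fn statement_pending(sql: &str) -> bool {
    let b = sql.as_bytes();
    let mut last: Option<u8> = None;
    let mut i = 0;
    while i < b.len() {
        match b[i] {
            // `--` line comment: skip to the end of the line.
            b'-' if b.get(i + 1) == Some(&b'-') => {
                while i < b.len() && b[i] != b'\n' {
                    i += 1;
                }
            }
            // `/* ... */` block comment.
            b'/' if b.get(i + 1) == Some(&b'*') => {
                i += 2;
                loop {
                    if i >= b.len() {
                        return true; // assumption: unterminated comment is pending
                    }
                    if b[i] == b'*' && b.get(i + 1) == Some(&b'/') {
                        i += 2;
                        break;
                    }
                    i += 1;
                }
            }
            // Quoted string or identifier; a doubled quote is an escape.
            q @ (b'\'' | b'"') => {
                i += 1;
                loop {
                    if i >= b.len() {
                        return true; // unterminated string is pending
                    }
                    if b[i] == q {
                        if b.get(i + 1) == Some(&q) {
                            i += 2;
                            continue;
                        }
                        i += 1;
                        break;
                    }
                    i += 1;
                }
                // The closing quote is significant; the contents are not.
                last = Some(q);
            }
            c if c.is_ascii_whitespace() => i += 1,
            c => {
                last = Some(c);
                i += 1;
            }
        }
    }
    // Pending unless nothing significant was seen or the last byte is `;`.
    matches!(last, Some(c) if c != b';')
}
```

With this check, `SELECT 1; -- done` is complete, so a following `.read` line can be classified as a dot-command, while `SELECT ';'` is still pending because the semicolon sits inside a string.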
Development

Successfully merging this pull request may close these issues.

Add LSP/VSCode support for sqlite shell commands (e.g. .read)
