mirror of https://github.com/sysown/proxysql
fix(pgsql): typecast handler swallows the rest of the query (#5755)
`process_pg_typecast()` in lib/pgsql_tokenizer.cpp had two related
bugs that made any `::TYPENAME` cast in the middle of a PostgreSQL
query silently truncate the digest from the cast onward.
## Bug 1: incorrect exit condition
After consuming the type name (and optional modifiers / array
brackets) the function tested whether the next char was in an
enumerated delimiter list (`)`, `(`, `;`, `,`, `+`, `-`, ...) and
returned `st_no_mark_found` only if so. For any other char (the
common case: a letter starting the next SQL keyword, e.g. the `F`
of `FROM`), it returned `st_pg_typecast`, which the dispatcher
then handled by:
1. advancing the cursor by one extra char via `inc_proc_pos()`
2. re-entering `process_pg_typecast()` on the next outer-loop
iteration
Because `tc->started` was already `true`, the re-entry bypassed the
block that skips `::` and began "parsing" a new type name from the
middle of the next clause. That silently consumed `FROM "Inventory"
AS i WHERE ((i."TenantId" = $1) AND NOT (i."IsDeleted"))` and
produced digests like `SELECT COUNT(*) AND ((((i."State" ...` that
drop ~65 chars of FROM/WHERE structure.
Fix: the function now unconditionally returns `st_no_mark_found`
after consuming the typecast. The earlier delimiter check tried to
distinguish "this is the end of the typecast" from "this is a
continuation", but no continuation case actually reaches that
point: type modifiers and array brackets are already handled inline
above the check, and PG does not allow a cast to continue across
any character other than a modifier or array bracket.
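
The exit behaviour described above can be sketched in isolation. This is a simplified, hypothetical model and not the actual code in lib/pgsql_tokenizer.cpp: `State` and `consume_typecast` are illustrative names, and whitespace before modifiers and quoted type names are deliberately left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-ins for the tokenizer's state codes.
enum class State { NoMarkFound, PgTypecast };

// Consume one `::type` cast that starts at q[pos].
// On return, pos points at the first character after the cast.
State consume_typecast(const std::string& q, std::size_t& pos) {
    assert(pos + 1 < q.size() && q[pos] == ':' && q[pos + 1] == ':');
    pos += 2;  // skip the "::"

    // Type name: letters, digits and underscores.
    while (pos < q.size() &&
           (std::isalnum(static_cast<unsigned char>(q[pos])) || q[pos] == '_')) {
        ++pos;
    }

    // Optional modifier such as "(10,2)", handled inline.
    if (pos < q.size() && q[pos] == '(') {
        while (pos < q.size() && q[pos] != ')') ++pos;
        if (pos < q.size()) ++pos;  // step past ')'
    }

    // Optional array brackets such as "[][]", handled inline.
    while (pos + 1 < q.size() && q[pos] == '[' && q[pos + 1] == ']') pos += 2;

    // Unconditional exit: whatever follows (a delimiter, a space, or the
    // first letter of the next keyword) belongs to the outer loop.
    return State::NoMarkFound;
}
```

For `SELECT a::int FROM t` with `pos` on the first `:`, the cursor ends on the space before `FROM`, so the rest of the query is left for the outer loop to tokenize.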
## Bug 2: `tc->started` never reset
`pg_typecast_st::started` is a one-shot guard for the entry block
that skips `::`. It was set to `true` on first entry but never
reset, so a second `::cast` later in the same query (`SELECT 1::int,
2::text`) re-entered with `started=true`, bypassed that entry block,
and ate two characters from the wrong position.
Fix: reset `started=false` when exiting via the new unconditional
`st_no_mark_found` return.
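
A minimal sketch of the one-shot guard with the reset in place, assuming a toy state struct. `TypecastState` and `read_cast` are hypothetical names used only for illustration; the real `pg_typecast_st` carries more fields than shown here.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Toy version of the per-cast state; only the guard flag is modelled.
struct TypecastState {
    bool started = false;
};

// Read the type name of the cast at q[pos] and advance pos past it.
std::string read_cast(const std::string& q, std::size_t& pos, TypecastState& tc) {
    if (!tc.started) {
        // One-shot entry block: skip the "::" exactly once per cast.
        assert(q.compare(pos, 2, "::") == 0);
        pos += 2;
        tc.started = true;
    }

    const std::size_t begin = pos;
    while (pos < q.size() && std::isalnum(static_cast<unsigned char>(q[pos]))) {
        ++pos;
    }

    // Reset on exit so the next cast in the same query skips its own "::".
    tc.started = false;
    return q.substr(begin, pos - begin);
}
```

With `SELECT 1::int, 2::text` and one shared `TypecastState`, calling `read_cast` at each `::` yields `int` and then `text`. Without the reset, the second call would start scanning at the `::` itself instead of at the type name.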
## Side-effect: trailing-whitespace handling
The post-type-name "skip whitespace" loop existed to handle modifier
forms like `::int (10)` and `::int []`, where PG allows whitespace
before the modifier/bracket. But it also ate the legitimate
separator space in the common form `::int FROM ...`, so after `int`
was consumed, `FROM` was glued to the previous token in the digest.
The new lookahead-conditional skip preserves the trailing space
unless the very next non-space character is a modifier `(` or array
bracket `[`.
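
The lookahead-conditional skip can be illustrated as follows. This is a sketch under the assumption that the check works on a plain string and index; `skip_space_before_modifier` is a hypothetical helper name, not a function from the tokenizer.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Return the position to continue from after a type name that ends at pos.
// Whitespace is skipped only if the next non-space character opens a
// modifier '(' or an array bracket '['; otherwise pos is returned unchanged
// so the separator space survives.
std::size_t skip_space_before_modifier(const std::string& q, std::size_t pos) {
    assert(pos <= q.size());
    std::size_t look = pos;
    while (look < q.size() && std::isspace(static_cast<unsigned char>(q[look]))) {
        ++look;  // peek ahead without committing
    }
    if (look < q.size() && (q[look] == '(' || q[look] == '[')) {
        return look;  // commit the skip: a modifier or bracket follows
    }
    return pos;  // keep the whitespace for the outer tokenizer
}
```

Given the tail ` (10)` or ` []` the helper moves to the `(` or `[`, while for ` FROM t` it returns the original position and the space before `FROM` is preserved.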
## Test
Added 3 regression tests (5 ok() assertions) to
`test/tap/tests/unit/pgsql_tokenizer_unit-t.cpp` registered in
`unit-tests-g1`:
- `test_digest_typecast_followed_by_clause` — full repro from the
issue, asserts `from` and `where` survive, asserts the
double-quoted identifier survives.
- `test_digest_typecast_then_identifier` — bisected minimal
`SELECT a::int FROM t`.
- `test_digest_typecast_multiple_in_same_query` — covers Bug 2 by
using two `::cast` instances followed by a `FROM` clause.
The existing 6 typecast tests (typecast_simple, _varchar,
_with_modifier, _array, _in_where, _quoted) and the
`digest_2_with_typecast` thread-local-wrapper test continue to pass
unchanged. Modifier edge cases (`::varchar (255)` with space,
`::numeric(10,2)`, `::int[][]`, `::"my type" FROM t`) were verified
manually with a standalone reproducer.
Closes #5755
pull/5764/head
parent 9cc20a8775
commit e6bef5c585