Merge pull request #5258 from sysown/misc251219

Documentation additions and bug fix for vacuum_stats()
pull/5257/head
René Cannaò 4 months ago committed by GitHub
commit 88edaac61b
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

@ -0,0 +1,262 @@
# ProxySQL: On-Demand Core Dump Generation (coredump_filters)
## Introduction
ProxySQL includes a debugging feature that allows on-demand generation of core dump files when specific code locations are reached. This is useful for diagnosing rare or hard-to-reproduce bugs without requiring a full debug build or restarting the proxy.
The feature works by:
1. **Defining filters**: Inserting filename and line-number pairs into the `coredump_filters` table.
2. **Enabling filters**: Loading the filters to runtime with `LOAD COREDUMP TO RUNTIME`.
3. **Triggering core dumps**: When the macro `generate_coredump()` is executed at a filtered location, a core file is written to disk (subject to rate-limiting and platform constraints).
Core dump generation is **rate-limited** and **platform-specific** (currently Linux on x86-32, x86-64, ARM, and MIPS architectures).
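The filter lookup and rate-limiting logic described above can be sketched as a small simulation. This is a simplified Python model with assumed names, not the actual C++ implementation; it only illustrates how a `"filename:line"` set, a lifetime threshold, and a minimum interval interact:

```python
class CoredumpFilterSim:
    """Simplified model of ProxySQL's coredump filter logic (illustrative only)."""

    def __init__(self, threshold=10, interval_ms=30000):
        self.filters = set()       # holds "filename:line" strings
        self.generated = 0         # lifetime dump counter
        self.last_ts_ms = None     # timestamp of the last dump, if any
        self.threshold = threshold
        self.interval_ms = interval_ms

    def load_filters(self, rows):
        # Mirrors LOAD COREDUMP TO RUNTIME: rebuild the in-memory set
        # and reset the rate-limiting counters.
        self.filters = {f"{fn}:{line}" for fn, line in rows}
        self.generated = 0
        self.last_ts_ms = None

    def should_dump(self, filename, line, now_ms):
        if f"{filename}:{line}" not in self.filters:
            return False           # location not filtered
        if self.generated >= self.threshold:
            return False           # lifetime threshold reached
        if self.interval_ms and self.last_ts_ms is not None \
                and now_ms - self.last_ts_ms < self.interval_ms:
            return False           # too soon after the previous dump
        self.generated += 1
        self.last_ts_ms = now_ms
        return True
```

With `threshold=2` and `interval_ms=1000`, a hit at the filtered location dumps, a second hit 500 ms later is suppressed by the interval, and once two dumps have been written all further hits are suppressed by the threshold.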
---
## Table Definitions
### `coredump_filters` (persistent configuration)
| Column | Type | Nullable | Primary Key | Description |
|----------|--------------|----------|-------------|-------------|
| filename | VARCHAR | NOT NULL | Yes | Source file name (as seen by the compiler) |
| line | INT | NOT NULL | Yes | Line number within that file |
**Primary key**: (`filename`, `line`)
**SQL definition**:
```sql
CREATE TABLE coredump_filters (
filename VARCHAR NOT NULL,
line INT NOT NULL,
PRIMARY KEY (filename, line)
);
```
### `runtime_coredump_filters` (runtime state)
This table mirrors the active filters currently loaded into memory. It is updated automatically when `LOAD COREDUMP TO RUNTIME` is executed.
**SQL definition**:
```sql
CREATE TABLE runtime_coredump_filters (
filename VARCHAR NOT NULL,
line INT NOT NULL,
PRIMARY KEY (filename, line)
);
```
---
## Configuration Variables
Two global variables control the rate-limiting behavior:
| Variable | Default | Range | Description |
|----------|---------|-------|-------------|
| `admin-coredump_generation_threshold` | 10 | 1-500 | Maximum number of core files that can be generated during the lifetime of the ProxySQL process. |
| `admin-coredump_generation_interval_ms` | 30000 (30 seconds) | 0-INT_MAX | Minimum time between two consecutive core dump generations. A value of `0` disables the interval check. |
**Notes**:
- Both variables are stored in the `global_variables` table (admin database).
- Changes take effect immediately when the variable is set (no need to `LOAD … TO RUNTIME`).
- The threshold is a **global counter**; once reached, no further core dumps will be generated until the process restarts.
- The interval is measured in **milliseconds**.
---
## Admin Commands
### `LOAD COREDUMP TO RUNTIME`
Reads the `coredump_filters` table and loads the filters into memory. After this command, any location matching a filter becomes active for core dump generation.
**Aliases**:
- `LOAD COREDUMP FROM MEMORY`
- `LOAD COREDUMP FROM MEM`
- `LOAD COREDUMP TO RUN`
**Example**:
```sql
LOAD COREDUMP TO RUNTIME;
```
**Effect**:
1. Clears the previous runtime filter set.
2. Reads all rows from `coredump_filters`.
3. Converts each row into a string `"filename:line"` and stores it in an internal hash set.
4. If at least one filter exists, the global flag `coredump_enabled` is set to `true`.
### `SAVE COREDUMP` (not implemented)
As of this writing, there is **no `SAVE COREDUMP` command**. The runtime state (`runtime_coredump_filters`) is automatically updated when filters are loaded, but there is no built-in command to persist runtime filters back to the `coredump_filters` table.
If you need to copy the active filters back to the configuration table, you can do so manually:
```sql
INSERT INTO coredump_filters SELECT * FROM runtime_coredump_filters;
```
---
## Important Notes
- **Case-sensitive filenames**: The `filename` column must match exactly the string returned by `__FILE__` (including the relative path from the source root). The comparison is case-sensitive.
- **Runtime-only behavior**: Filters loaded via `LOAD COREDUMP TO RUNTIME` are stored in memory only. They are lost when ProxySQL restarts. To make filters persistent, keep them in the `coredump_filters` table and reload after each restart.
- **Instance-specific**: Filters are local to the ProxySQL instance; there is no automatic synchronization across a cluster.
- **Rate limiting**: The feature includes two safety limits (`admin-coredump_generation_threshold` and `admin-coredump_generation_interval_ms`) to prevent disk filling and performance degradation.
- **Platform-specific**: Core dump generation works only on Linux x86-32, x86-64, ARM, and MIPS architectures. On other platforms the macro logs a warning and does nothing.
---
## Usage Example
### 1. Insert a filter
Suppose you want to generate a core dump when the function `MySQL_Data_Stream::check_data_flow()` reaches line 485 (where `generate_coredump()` is called).
First, find the exact file name as used in the source code. The macro `LOCATION()` expands to `__FILE__ ":" __LINE__`. For the file `lib/mysql_data_stream.cpp` line 485:
```sql
INSERT INTO coredump_filters (filename, line) VALUES ('lib/mysql_data_stream.cpp', 485);
```
### 2. (Optional) Adjust rate-limiting variables
Increase the threshold and shorten the interval if you expect to trigger the dump multiple times quickly:
```sql
SET admin-coredump_generation_threshold = 50;
SET admin-coredump_generation_interval_ms = 1000; -- 1 second
```
These changes take effect immediately.
### 3. Load filters to runtime
```sql
LOAD COREDUMP TO RUNTIME;
```
### 4. Trigger the condition
Cause the code path to reach the filtered location. In this example, you would need to create a MySQL data-stream condition where data exists at both ends of the stream (a fatal error). When that happens, ProxySQL will log:
```
[INFO] Coredump filter location 'lib/mysql_data_stream.cpp:485' was hit.
[INFO] Generating coredump file 'core.<pid>.<counter>'...
[INFO] Coredump file 'core.<pid>.<counter>' was generated ...
```
### 5. Inspect the core file
The core file is written in the current working directory of the ProxySQL process, with the name pattern `core.<pid>.<counter>` (e.g., `core.12345.0`). It is compressed using the **coredumper** library.
Analyze it with `gdb`:
```bash
gdb /usr/bin/proxysql core.12345.0
```
---
## Rate Limiting and Safety Features
To prevent disk filling and performance impact, core dump generation is protected by two mechanisms:
1. **Threshold limit**: The total number of core dumps generated during the process lifetime cannot exceed `admin-coredump_generation_threshold` (default 10). Once the threshold is reached, `generate_coredump()` will still log the hit but will not write a new core file.
2. **Interval limit**: After a core dump is written, at least `admin-coredump_generation_interval_ms` milliseconds must pass before another core dump can be generated (unless the interval is set to 0). This prevents burst generation when a hot code path is repeatedly executed.
Both counters are reset when `LOAD COREDUMP TO RUNTIME` is executed (or when `proxy_coredump_reset_stats()` is called internally).
---
## Platform Support
Core dump generation is **only available** on the following platforms:
- **Operating system**: Linux
- **Architectures**: x86-32 (`__i386__`), x86-64 (`__x86_64__`), ARM (`__ARM_ARCH_3__`), MIPS (`__mips__`)
On other platforms (e.g., FreeBSD, macOS, Windows) the `generate_coredump()` macro will log a warning and do nothing.
The feature relies on the **coredumper** library (https://github.com/elastic/coredumper), which is bundled as a dependency.
---
## Internal Implementation Details
### Macros and Functions
- `generate_coredump()`: The macro used in the source code to conditionally generate a core dump. It checks `coredump_enabled` and looks up the current `__FILE__:__LINE__` in the filter set.
- `proxy_coredump_load_filters()`: Loads a set of `"filename:line"` strings into the internal hash table.
- `proxy_coredump_generate()`: Performs the actual core dump writing, subject to rate-limiting checks.
- `proxy_coredump_reset_stats()`: Resets the generation counter and last-creation timestamp.
### Database-to-Runtime Flow
1. `LOAD COREDUMP TO RUNTIME` calls `ProxySQL_Admin::load_coredump_to_runtime()`.
2. That method calls `flush_coredump_filters_database_to_runtime()`, which:
3. Reads `coredump_filters` table, builds the string set, and passes it to `proxy_coredump_load_filters()`.
4. The runtime state is mirrored to `runtime_coredump_filters` via `dump_coredump_filter_values_table()`.
### Where `generate_coredump()` is Used
Currently, the macro is placed in a few strategic “fatal error” locations:
- `MySQL_Data_Stream::check_data_flow()` when data exists at both ends of a MySQL data stream.
- `PgSQL_Data_Stream::check_data_flow()` analogous condition for PostgreSQL.
Developers can add more `generate_coredump()` calls in other debugsensitive code sections.
---
## Troubleshooting
### No core file is generated even though the filter was hit
1. **Check platform support**: Verify ProxySQL is running on a supported Linux architecture.
2. **Check rate-limiting counters**: The global threshold may have been reached. Execute `LOAD COREDUMP TO RUNTIME` to reset the counters (or restart ProxySQL).
3. **Check directory permissions**: The process must have write permission in the current working directory.
4. **Check disk space**: Ensure there is sufficient free disk space.
### Error “Coredump generation is not supported on this platform.”
The platform is not among the supported architectures. Use a different machine or consider using a debugger instead.
### Filters are not being activated after `LOAD COREDUMP TO RUNTIME`
- Verify that the `filename` matches exactly the string that `__FILE__` expands to (relative path from the source root).
- Ensure the line number is correct (check the source code for the exact line where `generate_coredump()` appears).
- Inspect the `runtime_coredump_filters` table to confirm the filters were loaded.
### High frequency of core dumps is affecting performance
Increase `admin-coredump_generation_interval_ms` to space out the generation, or reduce `admin-coredump_generation_threshold` to limit the total number.
---
## Best Practices
1. **Use for debugging only**: Enable coredump filters only during debugging sessions. Remove filters afterward to avoid unnecessary overhead.
2. **Limit the threshold**: Keep `admin-coredump_generation_threshold` low (e.g., 1-5) unless you are investigating a recurring issue.
3. **Set a reasonable interval**: A minimum interval of several seconds (e.g., 30000 ms) prevents storm generation.
4. **Document filter locations**: Keep a record of why each filter was inserted and under what condition it triggers.
5. **Monitor disk usage**: Core files can be large; ensure the working directory has enough space and consider a dedicated partition.
---
## Related Features
- **Debug filters**: ProxySQL also supports `debug_filters` for enabling debug logs at specific file-line locations.
- **Core dump on crash**: For crash-induced core dumps, use system-level configuration (e.g., `ulimit -c unlimited`, `sysctl kernel.core_pattern`).
---
## Summary
The `coredump_filters` feature provides a targeted, rate-limited way to obtain core dumps from specific code locations without restarting ProxySQL or building a debug binary. It is a valuable tool for diagnosing elusive bugs in production-like environments.
Remember that core dump generation is **platform-dependent** and **rate-limited**; always verify support and adjust the configuration variables according to your debugging needs.

@ -0,0 +1,215 @@
# ProxySQL: Understanding "Closing killed client connection" Warnings
## Introduction
ProxySQL logs the message `"Closing killed client connection <IP>:<PORT>"` as a final cleanup step when a client session that has already been marked for termination is removed from a worker thread. This warning appears **only after** the session has been marked as killed (`killed=true`), and it does **not** indicate the **reason** for the kill.
To diagnose why connections are being killed, you must look for earlier log entries that explain **why** the session was marked as killed in the first place.
## Two-Phase Kill Process
ProxySQL handles client connection termination in two distinct phases:
1. **Kill Decision Phase**: A condition triggers the session to be marked as killed (`killed=true`).
- **With warning**: Most timeout-based mechanisms log a `"Killing client connection … because …"` warning at this point.
- **Without warning**: Explicit kill commands (admin interface, client KILL statements) set `killed=true` silently.
2. **Cleanup Phase**: The worker thread detects `killed=true` and performs final cleanup, logging:
`"Closing killed client connection <IP>:<PORT>"`
**Key Insight**: If you see **only** the `"Closing killed client connection …"` warning without a preceding `"Killing client connection because …"` message, the kill was triggered by an explicit command, **not** by a timeout.
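This triage rule can be automated when scanning logs. The following is a hypothetical helper (the regex patterns are derived from the messages quoted in this document; adjust them to your actual log layout): it classifies each cleanup warning as a timeout kill if a reason warning for the same `<IP>:<PORT>` appeared earlier, otherwise as an explicit kill.

```python
import re

# Patterns based on the warning texts quoted above (illustrative, not exhaustive).
REASON_RE  = re.compile(r"Killing client connection (\S+) because")
CLEANUP_RE = re.compile(r"Closing killed client connection (\S+)")

def classify_kills(log_lines):
    """Map each killed endpoint to 'timeout' or 'explicit'."""
    seen_reasons = set()
    results = {}
    for line in log_lines:
        m = REASON_RE.search(line)
        if m:
            # Phase 1 warning: remember the endpoint and its reason was logged.
            seen_reasons.add(m.group(1))
            continue
        m = CLEANUP_RE.search(line)
        if m:
            endpoint = m.group(1)
            results[endpoint] = "timeout" if endpoint in seen_reasons else "explicit"
    return results
```

Feeding it the two scenarios from the troubleshooting guide below would classify an endpoint with both warnings as `"timeout"` and one with only the cleanup warning as `"explicit"`.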
---
## Kill Triggers That Log a Warning
These mechanisms log a `"Killing client connection … because …"` warning **before** setting `killed=true`.
If any of these are the cause, you will see **both** the reason warning **and** the final cleanup warning.
### 1. **Idle Timeout** (`wait_timeout`)
- **Trigger**: Client connection inactive (no queries) longer than `wait_timeout`.
- **Warning**:
`"Killing client connection <IP>:<PORT> because inactive for <ms>ms"`
- **Configuration**:
- `mysql-wait_timeout` (global)
- Client-specific `wait_timeout` (if set via `SET wait_timeout=…`)
### 2. **Transaction Idle Timeout** (`max_transaction_idle_time`)
- **Trigger**: A transaction has been started but remains idle (no statements executed) longer than `max_transaction_idle_time`.
- **Warning**:
`"Killing client connection <IP>:<PORT> because of (possible) transaction idle for <ms>ms"`
- **Configuration**: `mysql-max_transaction_idle_time`
### 3. **Transaction Running Timeout** (`max_transaction_time`)
- **Trigger**: A transaction has been actively running (executing statements) longer than `max_transaction_time`.
- **Warning**:
`"Killing client connection <IP>:<PORT> because of (possible) transaction running for <ms>ms"`
- **Configuration**: `mysql-max_transaction_time`
### 4. **Fast-Forward Mode with Offline Backends**
- **Trigger**: Session is in `session_fast_forward` mode and all backends are OFFLINE.
- **Warning**:
`"Killing client connection <IP>:<PORT> due to 'session_fast_forward' and offline backends"`
- **Configuration**: `mysql-session_fast_forward`
---
## Kill Triggers That Do **Not** Log a Warning
These triggers set `killed=true` **without** logging a `"Killing client connection because …"` warning.
You will see **only** the final `"Closing killed client connection …"` message.
### 1. **Admin-Initiated Kill**
- **Trigger**: `KILL CONNECTION` command executed via the ProxySQL Admin interface.
- **Path**: `MySQL_Threads_Handler::kill_session()` → sets `killed=true` directly.
- **No warning logged** at kill decision time.
### 2. **Kill-Queue Request**
- **Trigger**: `kill_connection_or_query()` called by:
- MySQL client `KILL CONNECTION` or `KILL QUERY` statements
- Admin-interface kill commands (same as above)
- **Path**:
`kill_connection_or_query()` → places request in per-thread queue → `Scan_Sessions_to_Kill()` processes queue → sets `killed=true`.
- **No warning logged** when `killed=true` is set.
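The kill-queue path above can be modeled in a few lines. This is a simplified sketch with assumed structure (the real implementation is C++ inside the worker threads); it shows why no reason is ever attached to the session, only a deferred `killed=true` flag:

```python
from collections import deque

class WorkerThreadSim:
    """Illustrative model of the per-thread kill queue (not the real implementation)."""

    def __init__(self, session_ids):
        self.kill_queue = deque()
        self.sessions = {sid: {"killed": False} for sid in session_ids}

    def kill_connection_or_query(self, session_id):
        # Request is only enqueued here; no reason warning is logged.
        self.kill_queue.append(session_id)

    def scan_sessions_to_kill(self):
        # Later, the worker thread drains the queue and silently marks
        # matching sessions as killed; cleanup logs the final warning.
        while self.kill_queue:
            sid = self.kill_queue.popleft()
            if sid in self.sessions:
                self.sessions[sid]["killed"] = True
```

Only the cleanup phase that later notices `killed=true` produces any log line, which is why explicit kills surface solely as `"Closing killed client connection …"`.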
---
## Log Analysis and Troubleshooting Guide
### Scenario 1: You see **both** warnings
```
[WARNING] Killing client connection 192.168.123.45:56789 because inactive for 3600000ms
[WARNING] Closing killed client connection 192.168.123.45:56789
```
**Diagnosis**: A timeout mechanism killed the connection.
**Action**: Adjust the relevant timeout variable (`wait_timeout`, `max_transaction_idle_time`, etc.) or investigate why the client stayed idle so long.
### Scenario 2: You see **only** the cleanup warning
```
[WARNING] Closing killed client connection 192.168.123.45:56789
```
**Diagnosis**: The connection was killed by an explicit command (admin or client KILL).
**Action**:
1. Check if any component (application, connection pool, admin script) is issuing `KILL CONNECTION` commands.
2. Review ProxySQL admin logs for `KILL` commands.
3. Check application logs for connection-pool cleanup activities.
### Scenario 3: Many cleanup warnings appear unexpectedly
**Possible Causes**:
1. **Connection-pool cleanup**: Pools that aggressively close idle connections may issue `KILL` commands.
2. **Admin automation**: Scripts or monitoring tools that kill “stuck” connections.
3. **Client applications**: Applications that manually kill their own connections.
**Investigation Steps**:
1. **Enable audit logging**:
```sql
SET mysql-auditlog_filename='/var/log/proxysql_audit.log';
LOAD MYSQL VARIABLES TO RUNTIME;
```
Audit logs capture `KILL` commands with source IP and username.
2. **Check `stats_mysql_commands_counters`**:
```sql
SELECT * FROM stats_mysql_commands_counters WHERE Command='Kill';
```
Shows how many `KILL` commands have been executed.
3. **Monitor active kills**:
```sql
SELECT * FROM stats_mysql_processlist WHERE info LIKE 'KILL%';
```
Shows currently executing `KILL` commands.
4. **Review client-side logs**: Look for connection-pool or application-layer kill patterns.
---
## Configuration Parameters Reference
| Variable | Default | Description |
|----------|---------|-------------|
| `mysql-wait_timeout` | 28800000 ms (8 hours) | Maximum idle time before connection is killed. |
| `mysql-max_transaction_idle_time` | 0 (disabled) | Maximum idle time for an open transaction. |
| `mysql-max_transaction_time` | 0 (disabled) | Maximum total time a transaction can run. |
| `mysql-session_fast_forward` | false | Enable fast-forward mode (kills connections if backends go OFFLINE). |
| `mysql-throttle_max_transaction_time` | 0 (disabled) | Alternative to `max_transaction_time`; throttles instead of kills. |
**Note**: Timeouts are expressed in **milliseconds**. A value of `0` disables the timeout.
---
## Best Practices for Monitoring and Handling
### 1. **Differentiate Between Expected and Unexpected Kills**
- **Expected**: Connection-pool cleanup, scheduled maintenance, application-controlled termination.
- **Unexpected**: Unknown source of `KILL` commands, timeouts that are too aggressive for your workload.
### 2. **Set Appropriate Timeouts**
- **Production**: Align `wait_timeout` with your application's connection-pool `idleTimeout`.
- **Transactions**: Enable `max_transaction_idle_time` and `max_transaction_time` to prevent runaway transactions.
- **Testing**: Start with conservative values and adjust based on observed behavior.
### 3. **Use Audit Logging for Forensic Analysis**
```sql
-- Enable audit logging
SET mysql-auditlog_filename='/var/log/proxysql_audit.log';
SET mysql-auditlog_filesize=1000000;
SET mysql-auditlog=true;
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;
```
Audit logs record every `KILL` command with timestamp, client IP, and username.
### 4. **Monitor Kill Statistics**
```sql
-- Track kill sources
SELECT
SUM(CASE WHEN Command='Kill' THEN Total_Time_us ELSE 0 END) AS kill_time_us,
SUM(CASE WHEN Command='Kill' THEN cnt ELSE 0 END) AS kill_count
FROM stats_mysql_commands_counters;
-- Check for recent kills in the processlist
SELECT * FROM stats_mysql_processlist
WHERE info LIKE 'KILL%'
ORDER BY time_ms DESC
LIMIT 10;
```
### 5. **Responding to Unexpected Kill Storms**
1. **Identify the source** via audit logs and `stats_mysql_commands_counters`.
2. **If client applications**: Coordinate with developers to adjust connectionpool settings.
3. **If admin scripts**: Review automation logic and add appropriate guards.
4. **If unknown**: Temporarily enable verbose logging (`mysqlquery_processor_log`) to capture more context.
---
## Common Questions and Answers
### Q: Why don't I see a “Killing client connection because …” warning?
**A**: The kill was triggered by an explicit `KILL` command (admin or client), not by a timeout. Explicit kills do not log a reason at the moment `killed=true` is set.
### Q: Can I disable the “Closing killed client connection” warnings?
**A**: Yes, by lowering the log-verbosity level. However, doing so removes visibility into connection termination. Instead, investigate and address the root cause of the kills.
### Q: Are PostgreSQL connections handled differently?
**A**: The same two-phase pattern applies to PostgreSQL, with analogous timeout variables (`pgsql-wait_timeout`, etc.) and kill mechanisms. The warning messages are similar but may appear in `PgSQL_Thread` instead of `MySQL_Thread`.
### Q: How can I distinguish between a client `KILL` and an admin `KILL`?
**A**: Audit logs show the source IP and username. Client `KILL` commands originate from application IPs; admin `KILL` commands come from the admin-interface IP (usually `127.0.0.1` or your admin network).
### Q: What should I do if kills are causing application errors?
**A**:
1. Verify timeout values match your application's expected behavior.
2. Ensure connection pools are configured to `KILL` connections gracefully (e.g., with `COM_QUIT` instead of `KILL CONNECTION`).
3. Consider increasing timeouts temporarily while diagnosing.
---
## Summary
The `"Closing killed client connection …"` warning is a **cleanup message**, not a **root-cause indicator**. Diagnosing why connections are killed requires examining earlier logs for `"Killing client connection because …"` warnings or identifying explicit `KILL` commands via audit logs and statistics.
- **Timeout kills** → preceded by a reason warning.
- **Explicit kills** → no preceding reason warning.
Use the troubleshooting steps and monitoring practices outlined above to identify the source of kills and adjust your configuration or application behavior accordingly.

@ -0,0 +1,304 @@
# ProxySQL Query Rules: Capture Groups and Backreferences
## Introduction
ProxySQL's query rules engine supports regular expression capture groups and backreferences, allowing sophisticated query rewriting. This document explains how to use these features to transform SQL queries dynamically.
## Core Concepts
### Table Columns for Pattern Matching
| Column | Purpose | Example |
|--------|---------|---------|
| `digest` | Hash of the normalized query pattern | `0x1d2cc217c860282` |
| `match_digest` | Normalized query pattern with placeholders | `SELECT * FROM users WHERE id = ?` |
| `match_pattern` | Raw query pattern with regex groups | `SELECT (.*) FROM users WHERE id = (\d+)` |
| `replace_pattern` | Replacement pattern with backreferences | `SELECT \1 FROM customers WHERE user_id = \2` |
| `re_modifiers` | Regex modifiers | `'CASELESS'` or `'CASELESS,GLOBAL'` |
### Regex Engine Support
ProxySQL supports two regex engines (configurable via `mysql-query_processor_regex`):
- **PCRE** (default, `query_processor_regex=1`): Full regex support including capture groups
- **RE2** (`query_processor_regex=2`): Google's RE2 library, supports capture groups in replacement patterns
Both engines support backreferences (`\1`, `\2`, etc.) in `replace_pattern`.
## Basic Syntax
### Capture Groups in match_pattern
Use parentheses `()` to define capture groups:
```sql
-- Two capture groups: column list and WHERE clause
INSERT INTO mysql_query_rules (
match_pattern, replace_pattern, apply
) VALUES (
'SELECT (.*) FROM users WHERE (.*)',
'SELECT \1 FROM customers WHERE \2',
1
);
```
### Backreferences in replace_pattern
Reference captured groups with `\1`, `\2`, etc.:
```sql
-- \1 = column list, \2 = WHERE conditions
'\1 FROM modified_table WHERE \2'
```
**Important**: Use single backslash (`\1`), not double (`\\1`), in the SQL INSERT statement.
## Practical Examples
### Example 1: Changing Table Names While Preserving Query Structure
**Goal**: Rewrite queries from `old_table` to `new_table` while keeping all other parts unchanged.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, re_modifiers, apply
) VALUES (
1, 1,
'(SELECT .* FROM )old_table( WHERE .*)',
'\1new_table\2',
'CASELESS',
1
);
```
**Matches**: `SELECT id, name FROM old_table WHERE status = 'active'`
**Becomes**: `SELECT id, name FROM new_table WHERE status = 'active'`
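The same capture-and-backreference mechanics can be checked outside ProxySQL using Python's `re` module as a stand-in for the PCRE engine (same pattern and replacement as Example 1; note that in Python source the replacement is written as a raw string `r'\1new_table\2'`, while the ProxySQL `INSERT` uses a plain `\1`):

```python
import re

# Pattern and replacement copied from Example 1 above.
pattern = r'(SELECT .* FROM )old_table( WHERE .*)'
replacement = r'\1new_table\2'

query = "SELECT id, name FROM old_table WHERE status = 'active'"
# re.IGNORECASE plays the role of the CASELESS modifier.
rewritten = re.sub(pattern, replacement, query, flags=re.IGNORECASE)
print(rewritten)  # SELECT id, name FROM new_table WHERE status = 'active'
```

This kind of offline check is a quick way to validate a `match_pattern`/`replace_pattern` pair before inserting it into `mysql_query_rules`.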
### Example 2: Adding Hints to Specific Queries
**Goal**: Add `FORCE INDEX (primary)` to SELECT queries on `orders` table with specific conditions.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, digest, match_pattern, replace_pattern, apply
) VALUES (
2, 1, '0x1234567890abcdef',
'(SELECT .* FROM orders)( WHERE customer_id = \d+.*)',
'\1 FORCE INDEX (primary)\2',
1
);
```
**Matches**: `SELECT * FROM orders WHERE customer_id = 100 AND date > '2024-01-01'`
**Becomes**: `SELECT * FROM orders FORCE INDEX (primary) WHERE customer_id = 100 AND date > '2024-01-01'`
### Example 3: Column Renaming in SELECT Queries
**Goal**: Rename column `legacy_id` to `new_id` in all SELECT queries.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, re_modifiers, apply
) VALUES (
3, 1,
'(SELECT.*?)legacy_id(.*FROM.*)',
'\1new_id\2',
'CASELESS',
1
);
```
**Matches**: `SELECT legacy_id, name FROM products WHERE category = 'electronics'`
**Becomes**: `SELECT new_id, name FROM products WHERE category = 'electronics'`
### Example 4: Conditional Rewriting Based on Values
**Goal**: Add `USE INDEX` hint only for queries with `status = 'pending'`.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, re_modifiers, apply
) VALUES (
4, 1,
'(SELECT .* FROM tasks)( WHERE.*status\s*=\s*''pending''.*)',
'\1 USE INDEX (idx_status)\2',
'CASELESS',
1
);
```
**Matches**: `SELECT * FROM tasks WHERE status = 'pending' AND due_date < NOW()`
**Becomes**: `SELECT * FROM tasks USE INDEX (idx_status) WHERE status = 'pending' AND due_date < NOW()`
### Example 5: Complex Multi-Group Rewriting
**Goal**: Reorder WHERE clause conditions and add optimizer hint.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, re_modifiers, apply
) VALUES (
5, 1,
'SELECT (.*) FROM (\w+) WHERE (column1 = .*) AND (column2 = .*)',
'SELECT /*+ MAX_EXECUTION_TIME(1000) */ \1 FROM \2 WHERE \4 AND \3',
'CASELESS',
1
);
```
**Matches**: `SELECT id, name FROM accounts WHERE column1 = 'value1' AND column2 = 'value2'`
**Becomes**: `SELECT /*+ MAX_EXECUTION_TIME(1000) */ id, name FROM accounts WHERE column2 = 'value2' AND column1 = 'value1'`
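Reordering via numbered backreferences can also be verified offline with Python's `re` as a PCRE stand-in; the pattern, replacement, and input below are taken verbatim from Example 5:

```python
import re

pattern = r'SELECT (.*) FROM (\w+) WHERE (column1 = .*) AND (column2 = .*)'
# \4 and \3 are swapped relative to their capture order, reordering the
# WHERE conditions while the optimizer hint is added as literal text.
replacement = r'SELECT /*+ MAX_EXECUTION_TIME(1000) */ \1 FROM \2 WHERE \4 AND \3'

query = "SELECT id, name FROM accounts WHERE column1 = 'value1' AND column2 = 'value2'"
rewritten = re.sub(pattern, replacement, query, flags=re.IGNORECASE)
print(rewritten)
# SELECT /*+ MAX_EXECUTION_TIME(1000) */ id, name FROM accounts WHERE column2 = 'value2' AND column1 = 'value1'
```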
## Advanced Techniques
### Combining digest and match_pattern
For precise targeting, combine `digest` (hash of normalized query) with `match_pattern` (specific values):
```sql
INSERT INTO mysql_query_rules (
rule_id, active, digest, match_pattern, replace_pattern, apply
) VALUES (
6, 1, '0xa1b2c3d4e5f67890',
'(SELECT .* FROM users)( WHERE id = 12345.*)',
'\1 FORCE INDEX (primary)\2',
1
);
```
### Using re_modifiers
- `CASELESS`: Case-insensitive matching
- `GLOBAL`: Replace all occurrences (not just first)
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, re_modifiers, replace_pattern, apply
) VALUES (
7, 1,
'(SELECT)(.*)(FROM)(.*)',
'CASELESS,GLOBAL',
'\1 SQL_NO_CACHE \2\3\4',
1
);
```
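The effect of `GLOBAL` can be illustrated by analogy with Python's `re.sub` `count` argument (note the defaults are inverted: ProxySQL replaces only the first match unless `GLOBAL` is set, whereas `re.sub` replaces all matches unless `count=1` restricts it):

```python
import re

text = "SELECT a FROM t1; SELECT b FROM t2"

# Without GLOBAL: only the first occurrence is rewritten (count=1).
first_only = re.sub(r'SELECT', 'SELECT SQL_NO_CACHE', text, count=1)

# With GLOBAL: every occurrence is rewritten (default count=0 in re.sub).
everywhere = re.sub(r'SELECT', 'SELECT SQL_NO_CACHE', text)
```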
### Rule Chaining with flagIN/flagOUT
For complex transformations, chain rules using flags:
```sql
-- Rule 1: Match pattern and set flagOUT
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, flagOUT, apply
) VALUES (
8, 1, 'SELECT .* FROM sensitive_table', 100, 0
);
-- Rule 2: Apply transformation only when flagIN matches
INSERT INTO mysql_query_rules (
rule_id, active, flagIN, match_pattern, replace_pattern, apply
) VALUES (
9, 1, 100,
'(SELECT .* FROM )sensitive_table( WHERE .*)',
'\1audited_sensitive_table\2',
1
);
```
## Testing and Validation
### 1. Verify Rule Matching
```sql
-- Check stats for specific rule
SELECT * FROM stats_mysql_query_rules WHERE rule_id = 1;
-- Test pattern matching
SELECT * FROM mysql_query_rules
WHERE match_pattern = '(SELECT .* FROM )old_table( WHERE .*)';
```
### 2. Monitor Rule Performance
```sql
-- View hits and performance
SELECT rule_id, hits, mysql_query_rules.match_pattern,
sum_time, min_time, max_time
FROM stats_mysql_query_rules
JOIN mysql_query_rules USING (rule_id)
ORDER BY hits DESC;
```
## Common Pitfalls and Solutions
### Problem 1: Groups Not Capturing Entire Needed Text
**Symptom**: Replacement loses part of the original query.
**Solution**: Expand capture groups to include more context:
```sql
-- Before (loses WHERE clause):
'(SELECT .* FROM table)( WHERE)'
-- After (captures entire WHERE clause):
'(SELECT .* FROM table)( WHERE.*)'
```
### Problem 2: Backreferences Not Working
**Symptom**: `\1` appears literally in output instead of replaced text.
**Solution**: Ensure:
1. Parentheses in `match_pattern` define capture groups
2. `replace_pattern` uses `\1`, not `\\1` or `$1`
3. Rule is active (`active = 1`)
### Problem 3: Overly Broad Matching
**Symptom**: Rule applies to unintended queries.
**Solution**: Add more specific constraints:
- Use `digest` column to restrict to specific query patterns
- Add `username` or `schemaname` restrictions
- Make `match_pattern` more specific
```sql
INSERT INTO mysql_query_rules (
active, username, digest, match_pattern, replace_pattern, apply
) VALUES (
1, 'app_user', '0x1234567890abcdef',
'(SELECT .* FROM orders)( WHERE .*)',
'\1 FORCE INDEX (primary)\2',
1
);
```
## Best Practices
1. **Test Incrementally**: Start with simple patterns, then add complexity.
2. **Use digest for Precision**: Combine `digest` with `match_pattern` for exact targeting.
3. **Case-Insensitive by Default**: Use `re_modifiers = 'CASELESS'` unless case sensitivity is required.
4. **Monitor Performance**: Regularly check `stats_mysql_query_rules` for rule hits and timing.
5. **Document Rules**: Add comments to rules explaining their purpose:
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, apply, comment
) VALUES (
10, 1,
'(SELECT .* FROM )products( WHERE .*)',
'\1products_v2\2',
1,
'Rewrite: Route queries from products to products_v2 table'
);
```
6. **Version Control**: Keep query rule definitions in version-controlled SQL files.
## Conclusion
ProxySQL's capture group and backreference capabilities provide powerful query rewriting options. By understanding how to properly structure `match_pattern` with parentheses and reference captured groups with `\1`, `\2` in `replace_pattern`, you can implement sophisticated query transformations while maintaining query correctness.
Always test rules in a staging environment before deploying to production, and monitor their impact on query performance and correctness.

@ -298,6 +298,19 @@
#define STATS_SQLITE_TABLE_PGSQL_FREE_CONNECTIONS "CREATE TABLE stats_pgsql_free_connections (fd INT NOT NULL , hostgroup INT NOT NULL , srv_host VARCHAR NOT NULL , srv_port INT NOT NULL , user VARCHAR NOT NULL , database VARCHAR , init_connect VARCHAR , time_zone VARCHAR , sql_mode VARCHAR , idle_ms INT , statistics VARCHAR , pgsql_info VARCHAR)"
#define STATS_SQLITE_TABLE_PGSQL_USERS "CREATE TABLE stats_pgsql_users (username VARCHAR PRIMARY KEY , frontend_connections INT NOT NULL , frontend_max_connections INT NOT NULL)"
#define STATS_SQLITE_TABLE_PGSQL_PROCESSLIST "CREATE TABLE stats_pgsql_processlist (ThreadID INT NOT NULL , SessionID INTEGER PRIMARY KEY , user VARCHAR , database VARCHAR , cli_host VARCHAR , cli_port INT , hostgroup INT , l_srv_host VARCHAR , l_srv_port INT , srv_host VARCHAR , srv_port INT , backend_pid INT , backend_state VARCHAR , command VARCHAR , time_ms INT NOT NULL , info VARCHAR , status_flags INT , extended_info VARCHAR)"
/**
* @brief PostgreSQL `pg_stat_activity`-compatible view.
*
* This is a SQL VIEW, not a table. It provides a `pg_stat_activity`-like
* interface to the data stored in `stats_pgsql_processlist`. Because it is a
* view, attempting to execute `DELETE` on it will fail with SQLite error:
* `"cannot modify stats_pgsql_stat_activity because it is a view"`.
*
* @note This view must be excluded from any bulk-delete operations on
* statistics tables (e.g., in `ProxySQL_Admin::vacuum_stats()`).
* Deleting rows from the underlying `stats_pgsql_processlist` table
* automatically clears the view's content.
*/
#define STATS_SQLITE_TABLE_PGSQL_STAT_ACTIVITY "CREATE VIEW stats_pgsql_stat_activity AS SELECT ThreadID AS thread_id, database AS datname, SessionID AS pid, user AS usename, cli_host AS client_addr, cli_port AS client_port, hostgroup, l_srv_host, l_srv_port, srv_host, srv_port, backend_pid, backend_state AS state, command, time_ms AS duration_ms, info as query, status_flags, extended_info FROM stats_pgsql_processlist"
#define STATS_SQLITE_TABLE_PGSQL_ERRORS "CREATE TABLE stats_pgsql_errors (hostgroup INT NOT NULL , hostname VARCHAR NOT NULL , port INT NOT NULL , username VARCHAR NOT NULL , client_address VARCHAR NOT NULL , database VARCHAR NOT NULL , sqlstate VARCHAR NOT NULL , count_star INTEGER NOT NULL , first_seen INTEGER NOT NULL , last_seen INTEGER NOT NULL , last_error VARCHAR NOT NULL DEFAULT '' , PRIMARY KEY (hostgroup, hostname, port, username, database, sqlstate) )"
#define STATS_SQLITE_TABLE_PGSQL_ERRORS_RESET "CREATE TABLE stats_pgsql_errors_reset (hostgroup INT NOT NULL , hostname VARCHAR NOT NULL , port INT NOT NULL , username VARCHAR NOT NULL , client_address VARCHAR NOT NULL , database VARCHAR NOT NULL , sqlstate VARCHAR NOT NULL , count_star INTEGER NOT NULL , first_seen INTEGER NOT NULL , last_seen INTEGER NOT NULL , last_error VARCHAR NOT NULL DEFAULT '' , PRIMARY KEY (hostgroup, hostname, port, username, database, sqlstate) )"

@ -1888,6 +1888,35 @@ SQLite3_result * ProxySQL_Admin::generate_show_table_status(const char *tablenam
template<typename S>
void admin_session_handler(S* sess, void *_pa, PtrSize_t *pkt);
/**
* @brief Delete all rows from statistics tables and vacuum the database.
*
* This function is called when `TRUNCATE` commands are executed on statistics
* tables via the Admin interface. It performs two operations:
* 1. Deletes all rows from a predefined list of statistics tables (and their
* `*_reset` counterparts).
* 2. Executes `VACUUM` on the statistics database to reclaim space.
*
* The function respects the `variables.vacuum_stats` setting: if `false`,
* the function returns immediately without performing any operation.
*
* @param is_admin If `true`, operate on the `stats` schema within the admin
* database (`stats.*` tables). If `false`, operate on the
* standalone statistics database.
*
* @note The list of tables includes both MySQL and PostgreSQL statistics
* tables, even when the trigger is a MySQL-specific `TRUNCATE`. This
* ensures statistics are fully cleared regardless of the protocol that
* initiated the operation.
*
* @warning The table `stats_pgsql_stat_activity` is explicitly excluded from
* the deletion list because it is defined as a SQL VIEW (see
* `STATS_SQLITE_TABLE_PGSQL_STAT_ACTIVITY`). Attempting to `DELETE`
* from a view would cause a SQLite error:
* `"cannot modify stats_pgsql_stat_activity because it is a view"`.
* The view is based on `stats_pgsql_processlist`; clearing the
* underlying table automatically clears the view's content.
*/
void ProxySQL_Admin::vacuum_stats(bool is_admin) {
if (variables.vacuum_stats==false) {
return;
@ -1905,7 +1934,7 @@ void ProxySQL_Admin::vacuum_stats(bool is_admin) {
"stats_pgsql_prepared_statements_info",
"stats_mysql_processlist",
"stats_pgsql_processlist",
"stats_pgsql_stat_activity",
//"stats_pgsql_stat_activity", // VIEW, not a table; DELETE would fail
"stats_mysql_query_digest",
"stats_mysql_query_digest_reset",
"stats_pgsql_query_digest",
