Merge pull request #5258 from sysown/misc251219

Documentation additions and bug fix for vacuum_stats()
pull/5257/head
René Cannaò 4 months ago committed by GitHub
commit 88edaac61b
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194

@ -0,0 +1,262 @@
# ProxySQL: On-Demand Core Dump Generation (coredump_filters)
## Introduction
ProxySQL includes a debugging feature that allows on-demand generation of core dump files when specific code locations are reached. This is useful for diagnosing rare or hard-to-reproduce bugs without requiring a full debug build or restarting the proxy.
The feature works by:
1. **Defining filters**: Inserting filename and line-number pairs into the `coredump_filters` table.
2. **Enabling filters**: Loading the filters to runtime with `LOAD COREDUMP TO RUNTIME`.
3. **Triggering core dumps**: When the macro `generate_coredump()` is executed at a filtered location, a core file is written to disk (subject to rate-limiting and platform constraints).
Core dump generation is **rate-limited** and **platform-specific** (currently Linux on x86-32, x86-64, ARM, and MIPS architectures).
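The filter lookup and rate-limiting logic described above can be sketched as a small simulation. This is a simplified Python model with assumed names, not the actual C++ implementation; it only illustrates how a `"filename:line"` set, a lifetime threshold, and a minimum interval interact:

```python
class CoredumpFilterSim:
    """Simplified model of ProxySQL's coredump filter logic (illustrative only)."""

    def __init__(self, threshold=10, interval_ms=30000):
        self.filters = set()       # holds "filename:line" strings
        self.generated = 0         # lifetime dump counter
        self.last_ts_ms = None     # timestamp of the last dump, if any
        self.threshold = threshold
        self.interval_ms = interval_ms

    def load_filters(self, rows):
        # Mirrors LOAD COREDUMP TO RUNTIME: rebuild the in-memory set
        # and reset the rate-limiting counters.
        self.filters = {f"{fn}:{line}" for fn, line in rows}
        self.generated = 0
        self.last_ts_ms = None

    def should_dump(self, filename, line, now_ms):
        if f"{filename}:{line}" not in self.filters:
            return False           # location not filtered
        if self.generated >= self.threshold:
            return False           # lifetime threshold reached
        if self.interval_ms and self.last_ts_ms is not None \
                and now_ms - self.last_ts_ms < self.interval_ms:
            return False           # too soon after the previous dump
        self.generated += 1
        self.last_ts_ms = now_ms
        return True
```

With `threshold=2` and `interval_ms=1000`, a hit at the filtered location dumps, a second hit 500 ms later is suppressed by the interval, and once two dumps have been written all further hits are suppressed by the threshold.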
---
## Table Definitions
### `coredump_filters` (persistent configuration)
| Column | Type | Nullable | Primary Key | Description |
|----------|--------------|----------|-------------|-------------|
| filename | VARCHAR | NOT NULL | Yes | Source file name (as seen by the compiler) |
| line | INT | NOT NULL | Yes | Line number within that file |
**Primary key**: (`filename`, `line`)
**SQL definition**:
```sql
CREATE TABLE coredump_filters (
filename VARCHAR NOT NULL,
line INT NOT NULL,
PRIMARY KEY (filename, line)
);
```
### `runtime_coredump_filters` (runtime state)
This table mirrors the active filters currently loaded into memory. It is updated automatically when `LOAD COREDUMP TO RUNTIME` is executed.
**SQL definition**:
```sql
CREATE TABLE runtime_coredump_filters (
filename VARCHAR NOT NULL,
line INT NOT NULL,
PRIMARY KEY (filename, line)
);
```
---
## Configuration Variables
Two global variables control the rate-limiting behavior:
| Variable | Default | Range | Description |
|----------|---------|-------|-------------|
| `admin-coredump_generation_threshold` | 10 | 1-500 | Maximum number of core files that can be generated during the lifetime of the ProxySQL process. |
| `admin-coredump_generation_interval_ms` | 30000 (30 seconds) | 0-INT_MAX | Minimum time between two consecutive core dump generations. A value of `0` disables the interval check. |
**Notes**:
- Both variables are stored in the `global_variables` table (admin database).
- Changes take effect immediately when the variable is set (no need to `LOAD … TO RUNTIME`).
- The threshold is a **global counter**; once reached, no further core dumps will be generated until the process restarts.
- The interval is measured in **milliseconds**.
---
## Admin Commands
### `LOAD COREDUMP TO RUNTIME`
Reads the `coredump_filters` table and loads the filters into memory. After this command, any location matching a filter becomes active for core dump generation.
**Aliases**:
- `LOAD COREDUMP FROM MEMORY`
- `LOAD COREDUMP FROM MEM`
- `LOAD COREDUMP TO RUN`
**Example**:
```sql
LOAD COREDUMP TO RUNTIME;
```
**Effect**:
1. Clears the previous runtime filter set.
2. Reads all rows from `coredump_filters`.
3. Converts each row into a string `"filename:line"` and stores it in an internal hash set.
4. If at least one filter exists, the global flag `coredump_enabled` is set to `true`.
### `SAVE COREDUMP` (not implemented)
As of this writing, there is **no `SAVE COREDUMP` command**. The runtime state (`runtime_coredump_filters`) is automatically updated when filters are loaded, but there is no built-in command to persist runtime filters back to the `coredump_filters` table.
If you need to copy the active filters back to the configuration table, you can do so manually:
```sql
INSERT INTO coredump_filters SELECT * FROM runtime_coredump_filters;
```
---
## Important Notes
- **Case-sensitive filenames**: The `filename` column must match exactly the string returned by `__FILE__` (including the relative path from the source root). The comparison is case-sensitive.
- **Runtime-only behavior**: Filters loaded via `LOAD COREDUMP TO RUNTIME` are stored in memory only. They are lost when ProxySQL restarts. To make filters persistent, keep them in the `coredump_filters` table and reload after each restart.
- **Instance-specific**: Filters are local to the ProxySQL instance; there is no automatic synchronization across a cluster.
- **Rate limiting**: The feature includes two safety limits (`admin-coredump_generation_threshold` and `admin-coredump_generation_interval_ms`) to prevent disk filling and performance degradation.
- **Platform-specific**: Core dump generation works only on Linux x86-32, x86-64, ARM, and MIPS architectures. On other platforms the macro logs a warning and does nothing.
---
## Usage Example
### 1. Insert a filter
Suppose you want to generate a core dump when the function `MySQL_Data_Stream::check_data_flow()` reaches line 485 (where `generate_coredump()` is called).
First, find the exact file name as used in the source code. The macro `LOCATION()` expands to `__FILE__ ":" __LINE__`. For the file `lib/mysql_data_stream.cpp` line 485:
```sql
INSERT INTO coredump_filters (filename, line) VALUES ('lib/mysql_data_stream.cpp', 485);
```
### 2. (Optional) Adjust rate-limiting variables
Increase the threshold and shorten the interval if you expect to trigger the dump multiple times quickly:
```sql
SET admin-coredump_generation_threshold = 50;
SET admin-coredump_generation_interval_ms = 1000; -- 1 second
```
These changes take effect immediately.
### 3. Load filters to runtime
```sql
LOAD COREDUMP TO RUNTIME;
```
### 4. Trigger the condition
Cause the code path to reach the filtered location. In this example, you would need to create a MySQL data-stream condition where data exists at both ends of the stream (a fatal error). When that happens, ProxySQL will log:
```
[INFO] Coredump filter location 'lib/mysql_data_stream.cpp:485' was hit.
[INFO] Generating coredump file 'core.<pid>.<counter>'...
[INFO] Coredump file 'core.<pid>.<counter>' was generated ...
```
### 5. Inspect the core file
The core file is written in the current working directory of the ProxySQL process, with the name pattern `core.<pid>.<counter>` (e.g., `core.12345.0`). It is compressed using the **coredumper** library.
Analyze it with `gdb`:
```bash
gdb /usr/bin/proxysql core.12345.0
```
---
## Rate Limiting and Safety Features
To prevent disk filling and performance impact, core dump generation is protected by two mechanisms:
1. **Threshold limit**: The total number of core dumps generated during the process lifetime cannot exceed `admin-coredump_generation_threshold` (default 10). Once the threshold is reached, `generate_coredump()` will still log the hit but will not write a new core file.
2. **Interval limit**: After a core dump is written, at least `admin-coredump_generation_interval_ms` milliseconds must pass before another core dump can be generated (unless the interval is set to 0). This prevents burst generation when a hot code path is repeatedly executed.
Both counters are reset when `LOAD COREDUMP TO RUNTIME` is executed (or when `proxy_coredump_reset_stats()` is called internally).
---
## Platform Support
Core dump generation is **only available** on the following platforms:
- **Operating system**: Linux
- **Architectures**: x86-32 (`__i386__`), x86-64 (`__x86_64__`), ARM (`__ARM_ARCH_3__`), MIPS (`__mips__`)
On other platforms (e.g., FreeBSD, macOS, Windows) the `generate_coredump()` macro will log a warning and do nothing.
The feature relies on the **coredumper** library (https://github.com/elastic/coredumper), which is bundled as a dependency.
---
## Internal Implementation Details
### Macros and Functions
- `generate_coredump()`: The macro used in the source code to conditionally generate a core dump. It checks `coredump_enabled` and looks up the current `__FILE__:__LINE__` in the filter set.
- `proxy_coredump_load_filters()`: Loads a set of `"filename:line"` strings into the internal hash table.
- `proxy_coredump_generate()`: Performs the actual core dump writing, subject to rate-limiting checks.
- `proxy_coredump_reset_stats()`: Resets the generation counter and last-creation timestamp.
### Database-to-Runtime Flow
1. `LOAD COREDUMP TO RUNTIME` calls `ProxySQL_Admin::load_coredump_to_runtime()`.
2. That method calls `flush_coredump_filters_database_to_runtime()`, which:
3. Reads `coredump_filters` table, builds the string set, and passes it to `proxy_coredump_load_filters()`.
4. The runtime state is mirrored to `runtime_coredump_filters` via `dump_coredump_filter_values_table()`.
### Where `generate_coredump()` is Used
Currently, the macro is placed in a few strategic “fatal error” locations:
- `MySQL_Data_Stream::check_data_flow()` when data exists at both ends of a MySQL data stream.
- `PgSQL_Data_Stream::check_data_flow()` analogous condition for PostgreSQL.
Developers can add more `generate_coredump()` calls in other debugsensitive code sections.
---
## Troubleshooting
### No core file is generated even though the filter was hit
1. **Check platform support**: Verify ProxySQL is running on a supported Linux architecture.
2. **Check rate-limiting counters**: The global threshold may have been reached. Execute `LOAD COREDUMP TO RUNTIME` to reset the counters (or restart ProxySQL).
3. **Check directory permissions**: The process must have write permission in the current working directory.
4. **Check disk space**: Ensure there is sufficient free disk space.
### Error “Coredump generation is not supported on this platform.”
The platform is not among the supported architectures. Use a different machine or consider using a debugger instead.
### Filters are not being activated after `LOAD COREDUMP TO RUNTIME`
- Verify that the `filename` matches exactly the string that `__FILE__` expands to (relative path from the source root).
- Ensure the line number is correct (check the source code for the exact line where `generate_coredump()` appears).
- Inspect the `runtime_coredump_filters` table to confirm the filters were loaded.
### High frequency of core dumps is affecting performance
Increase `admin-coredump_generation_interval_ms` to space out the generation, or reduce `admin-coredump_generation_threshold` to limit the total number.
---
## Best Practices
1. **Use for debugging only**: Enable coredump filters only during debugging sessions. Remove filters afterward to avoid unnecessary overhead.
2. **Limit the threshold**: Keep `admin-coredump_generation_threshold` low (e.g., 1-5) unless you are investigating a recurring issue.
3. **Set a reasonable interval**: A minimum interval of several seconds (e.g., 30000 ms) prevents storm generation.
4. **Document filter locations**: Keep a record of why each filter was inserted and under what condition it triggers.
5. **Monitor disk usage**: Core files can be large; ensure the working directory has enough space and consider a dedicated partition.
---
## Related Features
- **Debug filters**: ProxySQL also supports `debug_filters` for enabling debug logs at specific file-line locations.
- **Core dump on crash**: For crash-induced core dumps, use system-level configuration (e.g., `ulimit -c unlimited`, `sysctl kernel.core_pattern`).
---
## Summary
The `coredump_filters` feature provides a targeted, rate-limited way to obtain core dumps from specific code locations without restarting ProxySQL or building a debug binary. It is a valuable tool for diagnosing elusive bugs in production-like environments.
Remember that core dump generation is **platform-dependent** and **rate-limited**; always verify support and adjust the configuration variables according to your debugging needs.

@ -0,0 +1,215 @@
# ProxySQL: Understanding "Closing killed client connection" Warnings
## Introduction
ProxySQL logs the message `"Closing killed client connection <IP>:<PORT>"` as a final cleanup step when a client session that has already been marked for termination is removed from a worker thread. This warning appears **only after** the session has been marked as killed (`killed=true`), and it does **not** indicate the **reason** for the kill.
To diagnose why connections are being killed, you must look for earlier log entries that explain **why** the session was marked as killed in the first place.
## Two-Phase Kill Process
ProxySQL handles client connection termination in two distinct phases:
1. **Kill Decision Phase**: A condition triggers the session to be marked as killed (`killed=true`).
- **With warning**: Most timeout-based mechanisms log a `"Killing client connection … because …"` warning at this point.
- **Without warning**: Explicit kill commands (admin interface, client KILL statements) set `killed=true` silently.
2. **Cleanup Phase**: The worker thread detects `killed=true` and performs final cleanup, logging:
`"Closing killed client connection <IP>:<PORT>"`
**Key Insight**: If you see **only** the `"Closing killed client connection …"` warning without a preceding `"Killing client connection because …"` message, the kill was triggered by an explicit command, **not** by a timeout.
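This triage rule can be automated when scanning logs. The following is a hypothetical helper (the regex patterns are derived from the messages quoted in this document; adjust them to your actual log layout): it classifies each cleanup warning as a timeout kill if a reason warning for the same `<IP>:<PORT>` appeared earlier, otherwise as an explicit kill.

```python
import re

# Patterns based on the warning texts quoted above (illustrative, not exhaustive).
REASON_RE  = re.compile(r"Killing client connection (\S+) because")
CLEANUP_RE = re.compile(r"Closing killed client connection (\S+)")

def classify_kills(log_lines):
    """Map each killed endpoint to 'timeout' or 'explicit'."""
    seen_reasons = set()
    results = {}
    for line in log_lines:
        m = REASON_RE.search(line)
        if m:
            # Phase 1 warning: remember the endpoint and its reason was logged.
            seen_reasons.add(m.group(1))
            continue
        m = CLEANUP_RE.search(line)
        if m:
            endpoint = m.group(1)
            results[endpoint] = "timeout" if endpoint in seen_reasons else "explicit"
    return results
```

Feeding it the two scenarios from the troubleshooting guide below would classify an endpoint with both warnings as `"timeout"` and one with only the cleanup warning as `"explicit"`.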
---
## Kill Triggers That Log a Warning
These mechanisms log a `"Killing client connection … because …"` warning **before** setting `killed=true`.
If any of these are the cause, you will see **both** the reason warning **and** the final cleanup warning.
### 1. **Idle Timeout** (`wait_timeout`)
- **Trigger**: Client connection inactive (no queries) longer than `wait_timeout`.
- **Warning**:
`"Killing client connection <IP>:<PORT> because inactive for <ms>ms"`
- **Configuration**:
- `mysql-wait_timeout` (global)
- Client-specific `wait_timeout` (if set via `SET wait_timeout=…`)
### 2. **Transaction Idle Timeout** (`max_transaction_idle_time`)
- **Trigger**: A transaction has been started but remains idle (no statements executed) longer than `max_transaction_idle_time`.
- **Warning**:
`"Killing client connection <IP>:<PORT> because of (possible) transaction idle for <ms>ms"`
- **Configuration**: `mysql-max_transaction_idle_time`
### 3. **Transaction Running Timeout** (`max_transaction_time`)
- **Trigger**: A transaction has been actively running (executing statements) longer than `max_transaction_time`.
- **Warning**:
`"Killing client connection <IP>:<PORT> because of (possible) transaction running for <ms>ms"`
- **Configuration**: `mysql-max_transaction_time`
### 4. **Fast-Forward Mode with Offline Backends**
- **Trigger**: Session is in `session_fast_forward` mode and all backends are OFFLINE.
- **Warning**:
`"Killing client connection <IP>:<PORT> due to 'session_fast_forward' and offline backends"`
- **Configuration**: `mysql-session_fast_forward`
---
## Kill Triggers That Do **Not** Log a Warning
These triggers set `killed=true` **without** logging a `"Killing client connection because …"` warning.
You will see **only** the final `"Closing killed client connection …"` message.
### 1. **Admin-Initiated Kill**
- **Trigger**: `KILL CONNECTION` command executed via the ProxySQL Admin interface.
- **Path**: `MySQL_Threads_Handler::kill_session()` → sets `killed=true` directly.
- **No warning logged** at kill decision time.
### 2. **Kill-Queue Request**
- **Trigger**: `kill_connection_or_query()` called by:
- MySQL client `KILL CONNECTION` or `KILL QUERY` statements
- Admin-interface kill commands (same as above)
- **Path**:
`kill_connection_or_query()` → places request in per-thread queue → `Scan_Sessions_to_Kill()` processes queue → sets `killed=true`.
- **No warning logged** when `killed=true` is set.
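The kill-queue path above can be modeled in a few lines. This is a simplified sketch with assumed structure (the real implementation is C++ inside the worker threads); it shows why no reason is ever attached to the session, only a deferred `killed=true` flag:

```python
from collections import deque

class WorkerThreadSim:
    """Illustrative model of the per-thread kill queue (not the real implementation)."""

    def __init__(self, session_ids):
        self.kill_queue = deque()
        self.sessions = {sid: {"killed": False} for sid in session_ids}

    def kill_connection_or_query(self, session_id):
        # Request is only enqueued here; no reason warning is logged.
        self.kill_queue.append(session_id)

    def scan_sessions_to_kill(self):
        # Later, the worker thread drains the queue and silently marks
        # matching sessions as killed; cleanup logs the final warning.
        while self.kill_queue:
            sid = self.kill_queue.popleft()
            if sid in self.sessions:
                self.sessions[sid]["killed"] = True
```

Only the cleanup phase that later notices `killed=true` produces any log line, which is why explicit kills surface solely as `"Closing killed client connection …"`.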
---
## Log Analysis and Troubleshooting Guide
### Scenario 1: You see **both** warnings
```
[WARNING] Killing client connection 192.168.123.45:56789 because inactive for 3600000ms
[WARNING] Closing killed client connection 192.168.123.45:56789
```
**Diagnosis**: A timeout mechanism killed the connection.
**Action**: Adjust the relevant timeout variable (`wait_timeout`, `max_transaction_idle_time`, etc.) or investigate why the client stayed idle so long.
### Scenario 2: You see **only** the cleanup warning
```
[WARNING] Closing killed client connection 192.168.123.45:56789
```
**Diagnosis**: The connection was killed by an explicit command (admin or client KILL).
**Action**:
1. Check if any component (application, connection pool, admin script) is issuing `KILL CONNECTION` commands.
2. Review ProxySQL admin logs for `KILL` commands.
3. Check application logs for connection-pool cleanup activities.
### Scenario 3: Many cleanup warnings appear unexpectedly
**Possible Causes**:
1. **Connection-pool cleanup**: Pools that aggressively close idle connections may issue `KILL` commands.
2. **Admin automation**: Scripts or monitoring tools that kill “stuck” connections.
3. **Client applications**: Applications that manually kill their own connections.
**Investigation Steps**:
1. **Enable audit logging**:
```sql
SET mysql-auditlog_filename='/var/log/proxysql_audit.log';
LOAD MYSQL VARIABLES TO RUNTIME;
```
Audit logs capture `KILL` commands with source IP and username.
2. **Check `stats_mysql_commands_counters`**:
```sql
SELECT * FROM stats_mysql_commands_counters WHERE Command='Kill';
```
Shows how many `KILL` commands have been executed.
3. **Monitor active kills**:
```sql
SELECT * FROM stats_mysql_processlist WHERE info LIKE 'KILL%';
```
Shows currently executing `KILL` commands.
4. **Review client-side logs**: Look for connection-pool or application-layer kill patterns.
---
## Configuration Parameters Reference
| Variable | Default | Description |
|----------|---------|-------------|
| `mysql-wait_timeout` | 28800000 ms (8 hours) | Maximum idle time before connection is killed. |
| `mysql-max_transaction_idle_time` | 0 (disabled) | Maximum idle time for an open transaction. |
| `mysql-max_transaction_time` | 0 (disabled) | Maximum total time a transaction can run. |
| `mysql-session_fast_forward` | false | Enable fast-forward mode (kills connections if backends go OFFLINE). |
| `mysql-throttle_max_transaction_time` | 0 (disabled) | Alternative to `max_transaction_time`; throttles instead of kills. |
**Note**: Timeouts are expressed in **milliseconds**. A value of `0` disables the timeout.
---
## Best Practices for Monitoring and Handling
### 1. **Differentiate Between Expected and Unexpected Kills**
- **Expected**: Connection-pool cleanup, scheduled maintenance, application-controlled termination.
- **Unexpected**: Unknown source of `KILL` commands, timeouts that are too aggressive for your workload.
### 2. **Set Appropriate Timeouts**
- **Production**: Align `wait_timeout` with your application's connection-pool `idleTimeout`.
- **Transactions**: Enable `max_transaction_idle_time` and `max_transaction_time` to prevent runaway transactions.
- **Testing**: Start with conservative values and adjust based on observed behavior.
### 3. **Use Audit Logging for Forensic Analysis**
```sql
-- Enable audit logging
SET mysql-auditlog_filename='/var/log/proxysql_audit.log';
SET mysql-auditlog_filesize=1000000;
SET mysql-auditlog=true;
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;
```
Audit logs record every `KILL` command with timestamp, client IP, and username.
### 4. **Monitor Kill Statistics**
```sql
-- Track kill sources
SELECT
SUM(CASE WHEN Command='Kill' THEN Total_Time_us ELSE 0 END) AS kill_time_us,
SUM(CASE WHEN Command='Kill' THEN cnt ELSE 0 END) AS kill_count
FROM stats_mysql_commands_counters;
-- Check for recent kills in the processlist
SELECT * FROM stats_mysql_processlist
WHERE info LIKE 'KILL%'
ORDER BY time_ms DESC
LIMIT 10;
```
### 5. **Responding to Unexpected Kill Storms**
1. **Identify the source** via audit logs and `stats_mysql_commands_counters`.
2. **If client applications**: Coordinate with developers to adjust connectionpool settings.
3. **If admin scripts**: Review automation logic and add appropriate guards.
4. **If unknown**: Temporarily enable verbose logging (`mysqlquery_processor_log`) to capture more context.
---
## Common Questions and Answers
### Q: Why don't I see a “Killing client connection because …” warning?
**A**: The kill was triggered by an explicit `KILL` command (admin or client), not by a timeout. Explicit kills do not log a reason at the moment `killed=true` is set.
### Q: Can I disable the “Closing killed client connection” warnings?
**A**: Yes, by lowering the log-verbosity level. However, doing so removes visibility into connection termination. Instead, investigate and address the root cause of the kills.
### Q: Are PostgreSQL connections handled differently?
**A**: The same two-phase pattern applies to PostgreSQL, with analogous timeout variables (`pgsql-wait_timeout`, etc.) and kill mechanisms. The warning messages are similar but may appear in `PgSQL_Thread` instead of `MySQL_Thread`.
### Q: How can I distinguish between a client `KILL` and an admin `KILL`?
**A**: Audit logs show the source IP and username. Client `KILL` commands originate from application IPs; admin `KILL` commands come from the admin-interface IP (usually `127.0.0.1` or your admin network).
### Q: What should I do if kills are causing application errors?
**A**:
1. Verify timeout values match your application's expected behavior.
2. Ensure connection pools are configured to `KILL` connections gracefully (e.g., with `COM_QUIT` instead of `KILL CONNECTION`).
3. Consider increasing timeouts temporarily while diagnosing.
---
## Summary
The `"Closing killed client connection …"` warning is a **cleanup message**, not a **root-cause indicator**. Diagnosing why connections are killed requires examining earlier logs for `"Killing client connection because …"` warnings or identifying explicit `KILL` commands via audit logs and statistics.
- **Timeout kills** → preceded by a reason warning.
- **Explicit kills** → no preceding reason warning.
Use the troubleshooting steps and monitoring practices outlined above to identify the source of kills and adjust your configuration or application behavior accordingly.

@ -0,0 +1,304 @@
# ProxySQL Query Rules: Capture Groups and Backreferences
## Introduction
ProxySQL's query rules engine supports regular expression capture groups and backreferences, allowing sophisticated query rewriting. This document explains how to use these features to transform SQL queries dynamically.
## Core Concepts
### Table Columns for Pattern Matching
| Column | Purpose | Example |
|--------|---------|---------|
| `digest` | Hash of the normalized query pattern | `0x1d2cc217c860282` |
| `match_digest` | Normalized query pattern with placeholders | `SELECT * FROM users WHERE id = ?` |
| `match_pattern` | Raw query pattern with regex groups | `SELECT (.*) FROM users WHERE id = (\d+)` |
| `replace_pattern` | Replacement pattern with backreferences | `SELECT \1 FROM customers WHERE user_id = \2` |
| `re_modifiers` | Regex modifiers | `'CASELESS'` or `'CASELESS,GLOBAL'` |
### Regex Engine Support
ProxySQL supports two regex engines (configurable via `mysql-query_processor_regex`):
- **PCRE** (default, `query_processor_regex=1`): Full regex support including capture groups
- **RE2** (`query_processor_regex=2`): Google's RE2 library, supports capture groups in replacement patterns
Both engines support backreferences (`\1`, `\2`, etc.) in `replace_pattern`.
## Basic Syntax
### Capture Groups in match_pattern
Use parentheses `()` to define capture groups:
```sql
-- Two capture groups: column list and WHERE clause
INSERT INTO mysql_query_rules (
match_pattern, replace_pattern, apply
) VALUES (
'SELECT (.*) FROM users WHERE (.*)',
'SELECT \1 FROM customers WHERE \2',
1
);
```
### Backreferences in replace_pattern
Reference captured groups with `\1`, `\2`, etc.:
```sql
-- \1 = column list, \2 = WHERE conditions
'\1 FROM modified_table WHERE \2'
```
**Important**: Use single backslash (`\1`), not double (`\\1`), in the SQL INSERT statement.
## Practical Examples
### Example 1: Changing Table Names While Preserving Query Structure
**Goal**: Rewrite queries from `old_table` to `new_table` while keeping all other parts unchanged.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, re_modifiers, apply
) VALUES (
1, 1,
'(SELECT .* FROM )old_table( WHERE .*)',
'\1new_table\2',
'CASELESS',
1
);
```
**Matches**: `SELECT id, name FROM old_table WHERE status = 'active'`
**Becomes**: `SELECT id, name FROM new_table WHERE status = 'active'`
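The same capture-and-backreference mechanics can be checked outside ProxySQL using Python's `re` module as a stand-in for the PCRE engine (same pattern and replacement as Example 1; note that in Python source the replacement is written as a raw string `r'\1new_table\2'`, while the ProxySQL `INSERT` uses a plain `\1`):

```python
import re

# Pattern and replacement copied from Example 1 above.
pattern = r'(SELECT .* FROM )old_table( WHERE .*)'
replacement = r'\1new_table\2'

query = "SELECT id, name FROM old_table WHERE status = 'active'"
# re.IGNORECASE plays the role of the CASELESS modifier.
rewritten = re.sub(pattern, replacement, query, flags=re.IGNORECASE)
print(rewritten)  # SELECT id, name FROM new_table WHERE status = 'active'
```

This kind of offline check is a quick way to validate a `match_pattern`/`replace_pattern` pair before inserting it into `mysql_query_rules`.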
### Example 2: Adding Hints to Specific Queries
**Goal**: Add `FORCE INDEX (primary)` to SELECT queries on `orders` table with specific conditions.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, digest, match_pattern, replace_pattern, apply
) VALUES (
2, 1, '0x1234567890abcdef',
'(SELECT .* FROM orders)( WHERE customer_id = \d+.*)',
'\1 FORCE INDEX (primary)\2',
1
);
```
**Matches**: `SELECT * FROM orders WHERE customer_id = 100 AND date > '2024-01-01'`
**Becomes**: `SELECT * FROM orders FORCE INDEX (primary) WHERE customer_id = 100 AND date > '2024-01-01'`
### Example 3: Column Renaming in SELECT Queries
**Goal**: Rename column `legacy_id` to `new_id` in all SELECT queries.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, re_modifiers, apply
) VALUES (
3, 1,
'(SELECT.*?)legacy_id(.*FROM.*)',
'\1new_id\2',
'CASELESS',
1
);
```
**Matches**: `SELECT legacy_id, name FROM products WHERE category = 'electronics'`
**Becomes**: `SELECT new_id, name FROM products WHERE category = 'electronics'`
### Example 4: Conditional Rewriting Based on Values
**Goal**: Add `USE INDEX` hint only for queries with `status = 'pending'`.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, re_modifiers, apply
) VALUES (
4, 1,
'(SELECT .* FROM tasks)( WHERE.*status\s*=\s*''pending''.*)',
'\1 USE INDEX (idx_status)\2',
'CASELESS',
1
);
```
**Matches**: `SELECT * FROM tasks WHERE status = 'pending' AND due_date < NOW()`
**Becomes**: `SELECT * FROM tasks USE INDEX (idx_status) WHERE status = 'pending' AND due_date < NOW()`
### Example 5: Complex Multi-Group Rewriting
**Goal**: Reorder WHERE clause conditions and add optimizer hint.
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, re_modifiers, apply
) VALUES (
5, 1,
'SELECT (.*) FROM (\w+) WHERE (column1 = .*) AND (column2 = .*)',
'SELECT /*+ MAX_EXECUTION_TIME(1000) */ \1 FROM \2 WHERE \4 AND \3',
'CASELESS',
1
);
```
**Matches**: `SELECT id, name FROM accounts WHERE column1 = 'value1' AND column2 = 'value2'`
**Becomes**: `SELECT /*+ MAX_EXECUTION_TIME(1000) */ id, name FROM accounts WHERE column2 = 'value2' AND column1 = 'value1'`
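Reordering via numbered backreferences can also be verified offline with Python's `re` as a PCRE stand-in; the pattern, replacement, and input below are taken verbatim from Example 5:

```python
import re

pattern = r'SELECT (.*) FROM (\w+) WHERE (column1 = .*) AND (column2 = .*)'
# \4 and \3 are swapped relative to their capture order, reordering the
# WHERE conditions while the optimizer hint is added as literal text.
replacement = r'SELECT /*+ MAX_EXECUTION_TIME(1000) */ \1 FROM \2 WHERE \4 AND \3'

query = "SELECT id, name FROM accounts WHERE column1 = 'value1' AND column2 = 'value2'"
rewritten = re.sub(pattern, replacement, query, flags=re.IGNORECASE)
print(rewritten)
# SELECT /*+ MAX_EXECUTION_TIME(1000) */ id, name FROM accounts WHERE column2 = 'value2' AND column1 = 'value1'
```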
## Advanced Techniques
### Combining digest and match_pattern
For precise targeting, combine `digest` (hash of normalized query) with `match_pattern` (specific values):
```sql
INSERT INTO mysql_query_rules (
rule_id, active, digest, match_pattern, replace_pattern, apply
) VALUES (
6, 1, '0xa1b2c3d4e5f67890',
'(SELECT .* FROM users)( WHERE id = 12345.*)',
'\1 FORCE INDEX (primary)\2',
1
);
```
### Using re_modifiers
- `CASELESS`: Case-insensitive matching
- `GLOBAL`: Replace all occurrences (not just first)
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, re_modifiers, replace_pattern, apply
) VALUES (
7, 1,
'(SELECT)(.*)(FROM)(.*)',
'CASELESS,GLOBAL',
'\1 SQL_NO_CACHE \2\3\4',
1
);
```
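The effect of `GLOBAL` can be illustrated by analogy with Python's `re.sub` `count` argument (note the defaults are inverted: ProxySQL replaces only the first match unless `GLOBAL` is set, whereas `re.sub` replaces all matches unless `count=1` restricts it):

```python
import re

text = "SELECT a FROM t1; SELECT b FROM t2"

# Without GLOBAL: only the first occurrence is rewritten (count=1).
first_only = re.sub(r'SELECT', 'SELECT SQL_NO_CACHE', text, count=1)

# With GLOBAL: every occurrence is rewritten (default count=0 in re.sub).
everywhere = re.sub(r'SELECT', 'SELECT SQL_NO_CACHE', text)
```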
### Rule Chaining with flagIN/flagOUT
For complex transformations, chain rules using flags:
```sql
-- Rule 1: Match pattern and set flagOUT
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, flagOUT, apply
) VALUES (
8, 1, 'SELECT .* FROM sensitive_table', 100, 0
);
-- Rule 2: Apply transformation only when flagIN matches
INSERT INTO mysql_query_rules (
rule_id, active, flagIN, match_pattern, replace_pattern, apply
) VALUES (
9, 1, 100,
'(SELECT .* FROM )sensitive_table( WHERE .*)',
'\1audited_sensitive_table\2',
1
);
```
## Testing and Validation
### 1. Verify Rule Matching
```sql
-- Check stats for specific rule
SELECT * FROM stats_mysql_query_rules WHERE rule_id = 1;
-- Test pattern matching
SELECT * FROM mysql_query_rules
WHERE match_pattern = '(SELECT .* FROM )old_table( WHERE .*)';
```
### 2. Monitor Rule Performance
```sql
-- View hits and performance
SELECT rule_id, hits, mysql_query_rules.match_pattern,
sum_time, min_time, max_time
FROM stats_mysql_query_rules
JOIN mysql_query_rules USING (rule_id)
ORDER BY hits DESC;
```
## Common Pitfalls and Solutions
### Problem 1: Groups Not Capturing Entire Needed Text
**Symptom**: Replacement loses part of the original query.
**Solution**: Expand capture groups to include more context:
```sql
-- Before (loses WHERE clause):
'(SELECT .* FROM table)( WHERE)'
-- After (captures entire WHERE clause):
'(SELECT .* FROM table)( WHERE.*)'
```
### Problem 2: Backreferences Not Working
**Symptom**: `\1` appears literally in output instead of replaced text.
**Solution**: Ensure:
1. Parentheses in `match_pattern` define capture groups
2. `replace_pattern` uses `\1`, not `\\1` or `$1`
3. Rule is active (`active = 1`)
### Problem 3: Overly Broad Matching
**Symptom**: Rule applies to unintended queries.
**Solution**: Add more specific constraints:
- Use `digest` column to restrict to specific query patterns
- Add `username` or `schemaname` restrictions
- Make `match_pattern` more specific
```sql
INSERT INTO mysql_query_rules (
active, username, digest, match_pattern, replace_pattern, apply
) VALUES (
1, 'app_user', '0x1234567890abcdef',
'(SELECT .* FROM orders)( WHERE .*)',
'\1 FORCE INDEX (primary)\2',
1
);
```
## Best Practices
1. **Test Incrementally**: Start with simple patterns, then add complexity.
2. **Use digest for Precision**: Combine `digest` with `match_pattern` for exact targeting.
3. **Case-Insensitive by Default**: Use `re_modifiers = 'CASELESS'` unless case sensitivity is required.
4. **Monitor Performance**: Regularly check `stats_mysql_query_rules` for rule hits and timing.
5. **Document Rules**: Add comments to rules explaining their purpose:
```sql
INSERT INTO mysql_query_rules (
rule_id, active, match_pattern, replace_pattern, apply, comment
) VALUES (
10, 1,
'(SELECT .* FROM )products( WHERE .*)',
'\1products_v2\2',
1,
'Rewrite: Route queries from products to products_v2 table'
);
```
6. **Version Control**: Keep query rule definitions in version-controlled SQL files.
## Conclusion
ProxySQL's capture group and backreference capabilities provide powerful query rewriting options. By understanding how to properly structure `match_pattern` with parentheses and reference captured groups with `\1`, `\2` in `replace_pattern`, you can implement sophisticated query transformations while maintaining query correctness.
Always test rules in a staging environment before deploying to production, and monitor their impact on query performance and correctness.

@ -298,6 +298,19 @@
#define STATS_SQLITE_TABLE_PGSQL_FREE_CONNECTIONS "CREATE TABLE stats_pgsql_free_connections (fd INT NOT NULL , hostgroup INT NOT NULL , srv_host VARCHAR NOT NULL , srv_port INT NOT NULL , user VARCHAR NOT NULL , database VARCHAR , init_connect VARCHAR , time_zone VARCHAR , sql_mode VARCHAR , idle_ms INT , statistics VARCHAR , pgsql_info VARCHAR)"
#define STATS_SQLITE_TABLE_PGSQL_USERS "CREATE TABLE stats_pgsql_users (username VARCHAR PRIMARY KEY , frontend_connections INT NOT NULL , frontend_max_connections INT NOT NULL)"
#define STATS_SQLITE_TABLE_PGSQL_PROCESSLIST "CREATE TABLE stats_pgsql_processlist (ThreadID INT NOT NULL , SessionID INTEGER PRIMARY KEY , user VARCHAR , database VARCHAR , cli_host VARCHAR , cli_port INT , hostgroup INT , l_srv_host VARCHAR , l_srv_port INT , srv_host VARCHAR , srv_port INT , backend_pid INT , backend_state VARCHAR , command VARCHAR , time_ms INT NOT NULL , info VARCHAR , status_flags INT , extended_info VARCHAR)"
/**
* @brief PostgreSQL `pg_stat_activity`-compatible view.
*
* This is a SQL VIEW, not a table. It provides a `pg_stat_activity`-like
* interface to the data stored in `stats_pgsql_processlist`. Because it is a
* view, attempting to execute `DELETE` on it will fail with SQLite error:
* `"cannot modify stats_pgsql_stat_activity because it is a view"`.
*
* @note This view must be excluded from any bulk-delete operations on
* statistics tables (e.g., in `ProxySQL_Admin::vacuum_stats()`).
* Deleting rows from the underlying `stats_pgsql_processlist` table
* automatically clears the view's content.
*/
#define STATS_SQLITE_TABLE_PGSQL_STAT_ACTIVITY "CREATE VIEW stats_pgsql_stat_activity AS SELECT ThreadID AS thread_id, database AS datname, SessionID AS pid, user AS usename, cli_host AS client_addr, cli_port AS client_port, hostgroup, l_srv_host, l_srv_port, srv_host, srv_port, backend_pid, backend_state AS state, command, time_ms AS duration_ms, info as query, status_flags, extended_info FROM stats_pgsql_processlist"
#define STATS_SQLITE_TABLE_PGSQL_ERRORS "CREATE TABLE stats_pgsql_errors (hostgroup INT NOT NULL , hostname VARCHAR NOT NULL , port INT NOT NULL , username VARCHAR NOT NULL , client_address VARCHAR NOT NULL , database VARCHAR NOT NULL , sqlstate VARCHAR NOT NULL , count_star INTEGER NOT NULL , first_seen INTEGER NOT NULL , last_seen INTEGER NOT NULL , last_error VARCHAR NOT NULL DEFAULT '' , PRIMARY KEY (hostgroup, hostname, port, username, database, sqlstate) )"
#define STATS_SQLITE_TABLE_PGSQL_ERRORS_RESET "CREATE TABLE stats_pgsql_errors_reset (hostgroup INT NOT NULL , hostname VARCHAR NOT NULL , port INT NOT NULL , username VARCHAR NOT NULL , client_address VARCHAR NOT NULL , database VARCHAR NOT NULL , sqlstate VARCHAR NOT NULL , count_star INTEGER NOT NULL , first_seen INTEGER NOT NULL , last_seen INTEGER NOT NULL , last_error VARCHAR NOT NULL DEFAULT '' , PRIMARY KEY (hostgroup, hostname, port, username, database, sqlstate) )"

@ -1888,6 +1888,35 @@ SQLite3_result * ProxySQL_Admin::generate_show_table_status(const char *tablenam
template<typename S>
void admin_session_handler(S* sess, void *_pa, PtrSize_t *pkt);
/**
* @brief Delete all rows from statistics tables and vacuum the database.
*
* This function is called when `TRUNCATE` commands are executed on statistics
* tables via the Admin interface. It performs two operations:
* 1. Deletes all rows from a predefined list of statistics tables (and their
* `*_reset` counterparts).
* 2. Executes `VACUUM` on the statistics database to reclaim space.
*
* The function respects the `variables.vacuum_stats` setting: if `false`,
* the function returns immediately without performing any operation.
*
* @param is_admin If `true`, operate on the `stats` schema within the admin
* database (`stats.*` tables). If `false`, operate on the
* standalone statistics database.
*
* @note The list of tables includes both MySQL and PostgreSQL statistics
* tables, even when the trigger is a MySQL-specific `TRUNCATE`. This
* ensures statistics are fully cleared regardless of the protocol that
* initiated the operation.
*
* @warning The table `stats_pgsql_stat_activity` is explicitly excluded from
* the deletion list because it is defined as a SQL VIEW (see
* `STATS_SQLITE_TABLE_PGSQL_STAT_ACTIVITY`). Attempting to `DELETE`
* from a view would cause a SQLite error:
* `"cannot modify stats_pgsql_stat_activity because it is a view"`.
* The view is based on `stats_pgsql_processlist`; clearing the
* underlying table automatically clears the view's content.
*/
void ProxySQL_Admin::vacuum_stats(bool is_admin) {
if (variables.vacuum_stats==false) {
return;
@ -1905,7 +1934,7 @@ void ProxySQL_Admin::vacuum_stats(bool is_admin) {
"stats_pgsql_prepared_statements_info",
"stats_mysql_processlist",
"stats_pgsql_processlist",
"stats_pgsql_stat_activity",
//"stats_pgsql_stat_activity", // VIEW, not a table; DELETE would fail
"stats_mysql_query_digest",
"stats_mysql_query_digest_reset",
"stats_pgsql_query_digest",
