test(read_only_offline_hard): set default_hostgroup to WHG, not hardcoded 0

Part of sysown/proxysql#5610 — third of four tests from the flake
tracking issue. Like pgsql-ssl_keylog-t, this turned out to be a
deterministic failure miscategorized as a flake.

## Background: the incomplete hostgroup migration

Two commits on v3.0 main set the stage:

1. 26c5a2572 ("Fix read_only test assuming default_hostgroup=0"):
   fixed the ORIGINAL bug where the test's writer hostgroup was 0
   but the CI infra set user.default_hostgroup to 1300. The fix set
   default_hostgroup=0 at the start of each scenario so the user
   routing matched the test's servers.

2. 4f9ab49e7 ("Fix hardcoded hostgroups in TAP tests for mysql84"):
   introduced TAP_MYSQL8_BACKEND_HG env var so the test's writer/
   reader hostgroups could be overridden for mysql84 (WHG=2900, RHG=
   2901 in the mysql84-g4 group). init_hostgroups() at file scope
   reads this env var into the static `WHG` and `RHG` symbols.
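
For readers unfamiliar with the override, here is a minimal sketch of how
a file-scope initializer of this kind could work. It is not the code from
4f9ab49e7: the function name `init_hostgroups_sketch`, the parsing rules,
and the `RHG = WHG + 1` relationship (inferred only from the 2900/2901
pair above) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical stand-ins for the test's file-scope symbols.
static int WHG = 0;        // writer hostgroup; 0 on the legacy infra
static int RHG = WHG + 1;  // ASSUMPTION: reader hostgroup is writer + 1

// Override WHG/RHG from TAP_MYSQL8_BACKEND_HG when it holds a valid
// non-negative integer; otherwise leave the current values untouched.
static void init_hostgroups_sketch() {
    const char* env = std::getenv("TAP_MYSQL8_BACKEND_HG");
    if (env == nullptr || *env == '\0') {
        return;  // variable unset or empty: keep the defaults
    }
    char* end = nullptr;
    const long hg = std::strtol(env, &end, 10);
    if (*end != '\0' || hg < 0) {
        return;  // malformed value: ignore it rather than guess
    }
    WHG = static_cast<int>(hg);
    RHG = WHG + 1;  // ASSUMPTION, see above
}
```

With `TAP_MYSQL8_BACKEND_HG=2900` in the environment, the sketch yields
WHG=2900 and RHG=2901; with the variable unset, WHG stays at 0.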

The second commit updated the `INSERT INTO mysql_servers` calls to
use WHG/RHG, but missed the `UPDATE mysql_users SET default_hostgroup=0`
added by the first commit. So on mysql84-g4:

 - insert_mysql_servers_records() puts servers in hostgroup 2900 (WHG)
 - UPDATE mysql_users SET default_hostgroup=0 points the user at
   hostgroup 0, which has no servers
 - the test opens a proxy connection as that user, runs BEGIN, and
   ProxySQL routes BEGIN to hostgroup 0 — which is empty
 - 10 s later: "Max connect timeout reached while reaching hostgroup 0"
 - the test aborts at subtest 4 of 18 on every run, so the failure is
   deterministic, not a flake
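
The mismatch in the steps above can be illustrated with a toy model. This
is not ProxySQL's routing code: `HostgroupTable` and
`default_hostgroup_has_servers` are hypothetical names, and the server
counts used with it are placeholders.

```cpp
#include <cassert>
#include <map>

// Toy model: hostgroup id -> number of configured backend servers.
using HostgroupTable = std::map<int, int>;

// True when the user's default hostgroup has at least one server to
// route to. A false result corresponds to the situation in which the
// proxy waits for a backend and eventually hits the connect timeout.
static bool default_hostgroup_has_servers(const HostgroupTable& servers,
                                          int default_hostgroup) {
    const auto it = servers.find(default_hostgroup);
    return it != servers.end() && it->second > 0;
}
```

With servers only in hostgroups 2900 and 2901, the check fails for a
default_hostgroup of 0 and succeeds for 2900, which is the difference
between the pre-fix and post-fix state on mysql84-g4.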

Verified by querying the running proxysql during a failing run:

  SELECT username, default_hostgroup FROM mysql_users;
  -> root     | 0
     user     | 2900
     testuser | 2900
     ...

`root` is the user the test connects as (cl.root_username). Its
default_hostgroup is still 0 because the setup step writes a hardcoded
0, while on mysql84-g4 the test's servers live in hostgroup 2900.

## Fix

Change the two `string_format("UPDATE mysql_users SET default_hostgroup=0 ...")`
calls in test_scenario_1 and test_scenario_2 to use the `WHG` static
symbol (already populated from TAP_MYSQL8_BACKEND_HG by
init_hostgroups() earlier in the same file), so the user's
default_hostgroup always matches whatever writer hostgroup the test
configured its servers into. Zero behavior change on the legacy infra
(where WHG defaults to 0 so the literal value is unchanged); correct
behavior on the mysql84 infra (where WHG=2900).
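
A self-contained sketch of the statement the fixed code ends up sending.
The real test uses the TAP suite's `string_format` helper; the function
`build_default_hostgroup_update` below is a hypothetical stand-in built
on `std::snprintf` so the example compiles on its own.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: format the UPDATE with the writer hostgroup
// passed in at runtime instead of a literal 0.
static std::string build_default_hostgroup_update(int writer_hg,
                                                  const std::string& username) {
    const char* fmt =
        "UPDATE mysql_users SET default_hostgroup=%d WHERE username='%s'";
    // First pass: measure the formatted length (excluding the NUL).
    const int len = std::snprintf(nullptr, 0, fmt, writer_hg, username.c_str());
    if (len < 0) {
        return std::string();  // formatting error
    }
    // Second pass: format into a buffer with room for the NUL.
    std::string out(static_cast<std::size_t>(len) + 1, '\0');
    std::snprintf(&out[0], out.size(), fmt, writer_hg, username.c_str());
    out.resize(static_cast<std::size_t>(len));  // drop the trailing NUL
    return out;
}
```

Called with 0 it produces the same text as the old hardcoded statement,
which is why the legacy infra sees no change; called with 2900 it points
the user at the hostgroup that actually holds the mysql84 servers.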

Same two-line pattern in both scenario functions.

## Local verification (dogfood of test/README.md §"Debugging a flaky test")

Three consecutive local iterations in the mysql84-g4 group with
TEST_PY_TAP_INCL=test_read_only_actions_offline_hard_servers-t:

  attempt 1: PASS
  attempt 2: PASS
  attempt 3: PASS
  === read_only_actions: 3/3 ===

Pre-fix: 0/3 (all hit the "Max connect timeout reached while reaching
hostgroup 0 after 10000ms" error and aborted at subtest 4 of 18).
Post-fix: 18/18 subtests run and pass on every iteration.

## Not touched here

- `pgsql-servers_ssl_params-t` — still stuck on the pgsql monitor
  not making SSL connections under `use_ssl=1`. Same open question
  as subtest 7 of pgsql-ssl_keylog-t (skipped in PR #5612). Needs
  pgsql monitor source analysis, not a test-side fix.
v3.0-fix-read-only-actions-hostgroup
Rene Cannao 1 month ago
parent 04466b35a5
commit 5d35a213ce

@@ -223,10 +223,19 @@ int test_scenario_1(MYSQL* proxy_admin, const CommandLine& cl) {
 MYSQL_QUERY__(proxy_admin, "DELETE FROM mysql_replication_hostgroups");
 MYSQL_QUERY__(proxy_admin, "LOAD MYSQL SERVERS TO RUNTIME");
-// Set default_hostgroup=0 to match writer_hostgroup used in this test
+// Set default_hostgroup to match writer_hostgroup (WHG) used in this
+// test. WHG defaults to 0 for the legacy infra but is overridden to
+// 2900 by the mysql84-g4 group via TAP_MYSQL8_BACKEND_HG — see
+// init_hostgroups() above. If we hardcode 0 here, the mysql84 group
+// routes the test's BEGIN query to an empty hostgroup 0 and times
+// out at 10 s. A previous fix (26c5a2572) hardcoded 0 because it
+// predated the mysql84 hostgroup migration in 4f9ab49e7 — this
+// commit finishes that fix by making the default_hostgroup follow
+// WHG at runtime.
 {
     std::string update_user;
-    string_format("UPDATE mysql_users SET default_hostgroup=0 WHERE username='%s'", update_user, cl.root_username);
+    string_format("UPDATE mysql_users SET default_hostgroup=%d WHERE username='%s'",
+        update_user, WHG, cl.root_username);
     MYSQL_QUERY__(proxy_admin, update_user.c_str());
     MYSQL_QUERY__(proxy_admin, "LOAD MYSQL USERS TO RUNTIME");
 }
@@ -384,10 +393,19 @@ int test_scenario_2(MYSQL* proxy_admin, const CommandLine& cl) {
 MYSQL_QUERY__(proxy_admin, "DELETE FROM mysql_replication_hostgroups");
 MYSQL_QUERY__(proxy_admin, "LOAD MYSQL SERVERS TO RUNTIME");
-// Set default_hostgroup=0 to match writer_hostgroup used in this test
+// Set default_hostgroup to match writer_hostgroup (WHG) used in this
+// test. WHG defaults to 0 for the legacy infra but is overridden to
+// 2900 by the mysql84-g4 group via TAP_MYSQL8_BACKEND_HG — see
+// init_hostgroups() above. If we hardcode 0 here, the mysql84 group
+// routes the test's BEGIN query to an empty hostgroup 0 and times
+// out at 10 s. A previous fix (26c5a2572) hardcoded 0 because it
+// predated the mysql84 hostgroup migration in 4f9ab49e7 — this
+// commit finishes that fix by making the default_hostgroup follow
+// WHG at runtime.
 {
     std::string update_user;
-    string_format("UPDATE mysql_users SET default_hostgroup=0 WHERE username='%s'", update_user, cl.root_username);
+    string_format("UPDATE mysql_users SET default_hostgroup=%d WHERE username='%s'",
+        update_user, WHG, cl.root_username);
     MYSQL_QUERY__(proxy_admin, update_user.c_str());
     MYSQL_QUERY__(proxy_admin, "LOAD MYSQL USERS TO RUNTIME");
 }
