test(mysql): enlarge grace close binlog to ~50 MB to un-flake target>8s iterations

fast_forward_grace_close-t writes a binlog, then for each target_time in {0..7, 20, 30, 60} reads it via a throttled client and asserts that target_time <= 8s reaches EOF and target_time > 8s does not (because the 8s grace_close_ms should cut the session before the binlog is drained). The >8s iterations have been flaky on GitHub runners.

Analysis of the actual CI logs (proxysql.log timestamps vs test TAP output) shows what goes wrong: during the 8s grace window, ProxySQL keeps pushing bytes into the client's kernel recv buffer as fast as the buffer accepts them. If the whole binlog, including the empty-event EOF marker, fits into "what the client has already read" + "what is sitting in the client's recv buffer" by t=8s, grace_close does fire and cut the session, but the client's next mysql_binlog_fetch just returns the already-buffered EOF event and reports reached_EOF=TRUE. The test then fails the "Expected FALSE" assertion.

With total_bytes = N and client recv-buffer capacity R, the test is only robust when N > ~1.7*R. Prior bumps (3 events -> 50 events, 2026-04-13) lifted N from 150 KB to 2.5 MB, which is enough on dbdeployer but too close to the margin on GH runners whose TCP recv autotune can grow past a couple of MB. 1000 x 50 KB = ~50 MB puts N well above any realistic R (kernels cap autotune around a few MB), so the grace close always fires while there are still many MB of undelivered binlog sitting in ProxySQL's output queue, and the client cannot reach EOF.

Side effects on other tests: none. The only other binlog-reading test in the same group, fast_forward_switch_replication_deprecate_eof-t, is already incidentally reading grace_close's binlog (it picks the first file from SHOW BINARY LOGS, which ends up being grace_close's rotated file). It reads at full speed and only asserts reached_EOF==true, which stays correct; it just reports ~50 MB instead of ~2.5 MB in its informational TAP message.

Cost: a full-speed read of 50 MB over the Docker bridge is sub-second per config (4 configs -> ~1-2 s added). Data generation at test start costs ~1-2 s for the extra INSERTs.
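
As a rough illustration of the N > ~1.7*R rule above, the sizing can be checked at compile time. This is only a sketch and not part of the change: binlog_outlasts_recv_buffer is a hypothetical helper, the recv-buffer sizes are assumed values chosen for illustration, and the per-event size counts only the 50 KB payload, ignoring binlog event overhead.

```cpp
#include <cstdint>

// Hypothetical helper, not part of the test. It encodes the rule of thumb
// from this message: the binlog size N must exceed ~1.7x the client's
// recv-buffer capacity R. Written in integer form: 10*N > 17*R.
constexpr bool binlog_outlasts_recv_buffer(std::uint64_t n_events,
                                           std::uint64_t event_bytes,
                                           std::uint64_t recv_buf_bytes) {
	return 10 * (n_events * event_bytes) > 17 * recv_buf_bytes;
}

// Payload of one INSERT: REPEAT('a', 1024*50). Event overhead is ignored.
constexpr std::uint64_t kEventBytes = 50 * 1024;
constexpr std::uint64_t kMiB = 1024 * 1024;

// The recv-buffer sizes below are assumptions for illustration only.
// 50 events (~2.5 MB) clear a small 128 KiB buffer...
static_assert(binlog_outlasts_recv_buffer(50, kEventBytes, 128 * 1024),
              "50 events should outlast a 128 KiB recv buffer");
// ...but not a buffer that autotune has grown to 2 MiB.
static_assert(!binlog_outlasts_recv_buffer(50, kEventBytes, 2 * kMiB),
              "50 events should be marginal against a 2 MiB recv buffer");
// 1000 events (~50 MB) still clear an assumed 6 MiB buffer.
static_assert(binlog_outlasts_recv_buffer(1000, kEventBytes, 6 * kMiB),
              "1000 events should outlast a 6 MiB recv buffer");
```

On a given runner, the actual upper bound for R could be taken from the kernel's TCP receive-buffer settings instead of the assumed values used here.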
4 weeks ago · e24ac092f6
parent 8da7a71a44
commit e24ac092f6
1 changed file with 10 additions and 4 deletions
--- a/test/tap/tests/fast_forward_grace_close.cpp
+++ b/test/tap/tests/fast_forward_grace_close.cpp
@@ -59,10 +59,16 @@ int main() {
 	MYSQL_QUERY(proxysql_conn, "CREATE DATABASE IF NOT EXISTS test");
 	MYSQL_QUERY(proxysql_conn, "USE test");
 	MYSQL_QUERY(proxysql_conn, "CREATE TABLE IF NOT EXISTS dummy_log_table (id INT PRIMARY KEY AUTO_INCREMENT, data LONGTEXT)");
-	// Generate enough binlog data so that throttled reads at target_time=20s
-	// actually take longer than the 8s grace close period. Each INSERT creates
-	// a ~50KB binlog event; we need many events so the per-event sleep adds up.
-	for (int i = 0; i < 50; i++) {
+	// Generate enough binlog data that the client's kernel recv buffer can
+	// hold only a small fraction of it. For target_time > 8s, grace close
+	// must fire before the full binlog (including the empty-event EOF marker)
+	// has been delivered to the client's socket — otherwise the client sees
+	// EOF anyway and the "expected FALSE" assertions flake. With autotuned
+	// recv buffers that can grow to several MB on some kernels/runners, a
+	// small binlog lets the EOF marker slip through during the 8s grace
+	// window. 1000 x 50 KB = ~50 MB gives a comfortable multiple of any
+	// realistic recv buffer.
+	for (int i = 0; i < 1000; i++) {
 		MYSQL_QUERY(proxysql_conn, "INSERT INTO dummy_log_table (data) VALUES (REPEAT('a', 1024*50))");
 	}
 	int rc = mysql_query(proxysql_conn, "FLUSH LOGS");