Published on 05/15/2025

Troubleshooting MariaDB Table Locks on AWS RDS

At Data-Sleek, we recently worked with SocialWifi, whose production system was severely impacted by a persistent MariaDB table locking issue hosted on AWS RDS. This blog documents our multi-day effort, combining system analysis, AWS tuning, and collaboration with SocialWifi’s CEO and our DBA. We hope this post helps others navigate similar production-level issues with MariaDB table locks and AWS RDS.

Diagnosing InnoDB Table Locks During Client Engagement

From our very first Zoom meeting, it was clear that the problem was unusual. Our client explained that their application had become completely unresponsive, and rebooting the RDS instance did not fix it. Our DBA raised suspicions about long-living transactions under `REPEATABLE READ`, a common default in MariaDB that can prevent InnoDB from completing rollback operations.

Upon inspection of the InnoDB engine status, we saw the now-infamous:

---TRANSACTION 6538069445, ACTIVE 61240 sec recovered trx
ROLLING BACK 1 lock struct(s), heap size 1128, 0 row lock(s), undo log entries 15

This line indicated a transaction InnoDB was still attempting to roll back, effectively locking parts of the table from being updated, even after reboot.

“ROLLING BACK 1 lock struct” — The Silent Killer

This message from InnoDB means that a transaction had failed or been interrupted, and InnoDB was now trying to roll it back in the background. But here’s the catch — if the undo logs are too large or corrupted, or if there’s a system crash mid-rollback, InnoDB can take hours or days to complete, or never complete at all.

Even more frustrating: this transactional state is embedded in the InnoDB buffer pool. So, if you take a snapshot, clone a replica, or reboot — the stuck rollback comes with it.

MariaDB Table Locking Issues in AWS RDS

Our first recommendation was to test recovery options:

– ❌ Snapshot restore: still showed the same InnoDB rollback transaction

– ❌ Read replica creation: same issue — replica locked on startup

– ✅ Manual backup & restore into clean instance: potential path forward

But this led to new challenges.

MariaDB and the Repeatable Read Trap

Our DBA helped us confirm that `REPEATABLE READ` isolation level, used by default in MariaDB, could preserve snapshot views of data and hold locks indefinitely. In conjunction with a missing `innodb_kill_idle_transaction`, the environment was primed for lock persistence.

Client Collaboration and Real-Time Triage

During several Zoom calls, we worked closely with SocialWifi. Oscar shared webhook timeouts and Redis queue slowdowns caused by the lock. Our DBA explored triggers and confirmed they weren’t causing deadlocks. We created a game plan:

– Archive the largest tables separately

– Disable `foreign_key_checks` and `unique_checks` during restore

– Use `innodb_flush_log_at_trx_commit = 2` to improve disk I/O

Incomplete AWS Documentation About MariaDB Binlog Retention

We assumed that binlog retention would follow the `backup_retention_period`, just like in RDS MySQL.

But our data engineer flagged that replication setup kept failing due to missing binlogs. Eventually, we discovered:

> In RDS MariaDB, binary logs are NOT retained based on backup retention — they are purged unless explicitly configured.

We had to use the dedicate RDS stored procedure rds_set_configuration:

CALL mysql.rds_set_configuration('binlog retention hours', 168);

Even ChatGPT did not know about this special stored proc.

How We Fixed Persistent MariaDB Table Locks on AWS RDS

We took the following steps to resolve the issue:

1. Stood up a new RDS MariaDB instance

2. Used the famous mysqldump to take backup passing the right parameters

3. Restored using mysql using -f to force the restore.

4. Mydumper and myloader did not provide any gain (maybe additional test are needed)

4. Configured binlog retention using `CALL mysql.rds_set_configuration(…)`

5. Manually set replication using `mysql.rds_set_external_master`

After several verification rounds and cleanup tasks, everything synced up. The new instance was promoted to primary, and the locking issue disappeared.

Lessons Learned and Takeaways

1. Managed databases still need expert care

MariaDB on RDS is powerful, but when it goes sideways, you need someone who understands storage engines and replication internals.

2. “ROLLING BACK 1 lock struct” will haunt your replicas

If you clone or snapshot a broken instance, expect the problem to follow. Only a clean restore solves this.

3. Binlog retention on RDS MariaDB is not tied to backup retention

You must explicitly set binlog retention using:

sql
CALL mysql.rds_set_configuration('binlog retention hours', 168);

4. Tune your restores

Use `mydumper`, disable constraints, increase instance size, and monitor IOPS.

5. Communication is everything

Daily syncs with Oscar and Bala allowed us to iterate, validate, and keep the business informed during a high-stakes outage.

Conclusion: Even with a Managed Database, You Still Need Expert Oversight.

When production databases lock up, there are no shortcuts. At Data-Sleek, we bring deep expertise, creative diagnostics, and tailored solutions to even the gnarliest MySQL/MariaDB issues.

If you’re running a database that’s growing in size or complexity, don’t wait until it breaks. Proactive tuning, binlog configuration, and transaction monitoring can save you hours—or days—of recovery later.

👉 Need help with MariaDB, RDS, or performance firefighting? Let’s talk.

FAQ: Troubleshooting MariaDB Table Locks on AWS RDS

Why does MariaDB table locking happen on AWS RDS?

MariaDB table locks on AWS RDS usually happen because of long-running transactions stuck in rollback, often under the default REPEATABLE READ isolation level. InnoDB may struggle to release locks if a crash, timeout, or undo log corruption occurs—causing persistent locking issues even after reboot or snapshot restoration.

How can you speed up InnoDB rollback recovery?

To speed up InnoDB rollback, you should minimize active transactions, increase available IOPS, and monitor the undo logs closely. In severe cases, restoring from a clean backup and carefully rebuilding replication is faster than waiting for InnoDB to finish the rollback, which can take hours or even days.

How does binlog retention affect replication on AWS RDS MariaDB?

On AWS RDS MariaDB, binary logs are not retained based on backup retention settings by default. Missing binlogs can prevent successful replica creation. To avoid replication failures, you must manually set binlog retention using CALL mysql.rds_set_configuration(‘binlog retention hours’, 168);.

Should you use snapshots to fix MariaDB locks on AWS RDS?

No. If an RDS snapshot is taken while a transaction is stuck in rollback, restoring that snapshot will simply reintroduce the problem. The recommended fix is to perform a manual backup (such as mysqldump), restore to a clean instance, and reconfigure replication carefully.