Zero-Downtime MySQL to Aurora Migration: An Enterprise-Grade AWS DMS Runbook
Modernizing databases is rarely about “just moving data.” It is about consistency, safety, rollback readiness, and zero downtime.
In this post, I walk through a production-proven MySQL → Amazon Aurora MySQL migration using AWS Database Migration Service (DMS) with Full Load + Change Data Capture (CDC).
This guide is based on real-world enterprise migrations and focuses on what actually breaks in production—and how to prevent it.
Components
• Source: MySQL (on-premises or EC2)
• Target: Amazon Aurora MySQL (private subnets)
• Migration Engine: AWS DMS (Full Load + CDC)
• Networking: VPC, private subnets, NAT Gateway
• Access: Bastion host for controlled administration
• Observability: CloudWatch logs and metrics
Why This Architecture Works
• No application downtime during bulk load
• Continuous replication via binlogs
• Controlled, reversible cutover
• Safe rollback at every stage
Migration Checklist for Full Load + CDC
1. Pre‑Migration Checks (SOURCE MySQL):
1.1: Version & Binlog Validation
1.2: GTID Validation
1.3: Timeout Configuration
1.4: Charset & Collation Validation
1.5: Validate Unsupported and Partially Supported Data Types
1.6: Replication User
1.7: Schema Backup (No Deferred Objects)
2. Target Environment (Amazon Aurora MySQL):
2.1: Create Subnet Group & Security Groups
2.2: Create Custom Cluster Parameter Group
2.3: Create Aurora Cluster
2.4: Deploy a Bastion/Jump Host
2.5: Target Schema Creation
2.6: Remove Foreign Key Constraints
3. AWS DMS Configuration:
3.1: Replication Instance
3.2: Task Settings
3.3: Table Mappings
3.4: Task Creation
3.5: Pre-Migration Assessment
4. Full Load + CDC Execution
5. Cutover Procedure:
5.1: CDC Drain Verification
5.2: Freeze Writes
5.3: Row Count Validation
5.4: Rollback Safety
5.5: Cutover Point
6. Recreate DEFERRED Objects.
6.1: Recreate Stored Procedures, Functions, Events, Triggers and Views
6.2: Recreate Foreign Keys
7. Rollback Strategy
8. Start Application Traffic
9. Post‑Go‑Live Validation
An Enterprise-Grade, Zero-Downtime Migration Runbook
1.1: Version & Binlog Validation
SELECT VERSION();
SHOW VARIABLES LIKE 'log_bin';
SHOW VARIABLES WHERE Variable_name IN ('binlog_format','binlog_row_image','server_id');
SHOW VARIABLES LIKE 'binlog_expire_logs_seconds';
Required values
log_bin = ON
binlog_format = ROW
binlog_row_image = FULL
server_id != 0
Ensure binlog retention is greater than the expected migration + CDC duration to avoid binlog purge during DMS replication.
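On MySQL 8.0, retention is expressed in seconds via binlog_expire_logs_seconds (MySQL 5.7 uses expire_logs_days instead). A quick sketch for sizing it, assuming a 7-day window is enough to cover full load plus CDC headroom (adjust for your own migration window):

```shell
# Assumed window: 7 days of binlog retention for full load + CDC headroom.
RETENTION_DAYS=7
RETENTION_SECONDS=$((RETENTION_DAYS * 24 * 3600))
# Emit the statement to run on the source (MySQL 8.0 variable name):
echo "SET GLOBAL binlog_expire_logs_seconds = ${RETENTION_SECONDS};"
```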
1.2: GTID Validation
SHOW VARIABLES LIKE 'gtid_mode';
SHOW VARIABLES LIKE 'enforce_gtid_consistency';
GTID ON or OFF is supported, but must be consistent.
GTID mode must remain unchanged for the entire duration of Full Load + CDC.
Do NOT enable or disable GTID while DMS is running.
1.3: Timeout Configuration
SHOW VARIABLES WHERE Variable_name IN('net_read_timeout','net_write_timeout','wait_timeout');
SET GLOBAL net_read_timeout = 600;
SET GLOBAL net_write_timeout = 600;
SET GLOBAL wait_timeout = 600;  -- the MySQL default is also acceptable here
1.4 Charset & Collation Validation
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME IN ('your_db1','your_db2');
Aurora MySQL 8.0 defaults differ from many MySQL 5.7 sources. Ensure the Aurora parameter group matches the source (e.g., utf8mb4, not latin1).
SQL_MODE comparison (source vs target)
SELECT @@sql_mode;
Compare SQL_MODE between source and Aurora. Differences may affect triggers, functions, or strict inserts.
If sql_mode differs, explicitly align Aurora to source before loading data, especially flags like:
STRICT_TRANS_TABLES
NO_ZERO_DATE
ONLY_FULL_GROUP_BY
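The flag comparison can be scripted. A minimal sketch that lists flags present on the source but missing on the target (the two mode strings below are illustrative; substitute the SELECT @@sql_mode output from each server):

```shell
# Illustrative sql_mode values; replace with real SELECT @@sql_mode output.
SRC_MODE="STRICT_TRANS_TABLES,NO_ZERO_DATE,ONLY_FULL_GROUP_BY"
TGT_MODE="ONLY_FULL_GROUP_BY,NO_ENGINE_SUBSTITUTION"
# Split each list into sorted lines, then keep flags only present on the source.
MISSING=$(comm -23 \
  <(echo "$SRC_MODE" | tr ',' '\n' | sort) \
  <(echo "$TGT_MODE" | tr ',' '\n' | sort) | paste -sd, -)
echo "$MISSING"
```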
1.5: Validate Unsupported and Partially Supported Data Types
AWS DMS pre-migration assessment may report certain MySQL data types as PARTIALLY SUPPORTED.
These do not automatically block migration, but they must be reviewed and validated to ensure application compatibility after cutover.
Data types commonly flagged by AWS DMS (MySQL → Aurora)
Data Type            Support Level         Notes
---------------------------------------------------------------------------------------
ENUM                 Partially supported   Stored as strings; app logic must be validated
SET                  Partially supported   Bitmask behavior differs
GEOMETRY / spatial   Partially supported   Requires spatial function validation
JSON                 Supported             Version compatibility matters
BIT                  Supported             Casting issues possible
YEAR                 Supported             Formatting differences
Inspect source MySQL existing data types:
SELECT
TABLE_SCHEMA,
TABLE_NAME,
COLUMN_NAME,
DATA_TYPE,
COLUMN_TYPE
FROM information_schema.COLUMNS
WHERE DATA_TYPE IN (
'enum','set','geometry','point','linestring','polygon',
'multipoint','multilinestring','multipolygon',
'geometrycollection','json','bit','year'
)
ORDER BY TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME;
1.6 Replication User
CREATE USER 'dms_user'@'%' IDENTIFIED BY 'StrongPassword123!';
GRANT REPLICATION SLAVE, REPLICATION CLIENT, SELECT, SHOW VIEW ON *.* TO 'dms_user'@'%';
FLUSH PRIVILEGES;
1.7 Schema Backup (No Deferred Objects)
Take a DDL-only backup of the migrating schemas, excluding stored procedures, functions, events, and triggers.
mysqldump -u root -p db1 --no-data --skip-routines --skip-triggers --skip-events > /path/to/backup/db1_schema.sql
mysqldump -u root -p db2 --no-data --skip-routines --skip-triggers --skip-events > /path/to/backup/db2_schema.sql
2.1: Create Subnet Group & Security Groups
Create DB Subnet Group:
aws rds create-db-subnet-group \
--db-subnet-group-name aurora-prod-subnet-group \
--db-subnet-group-description "Aurora production subnet group" \
--subnet-ids subnet-aaaaaaa subnet-bbbbbbb
Verify the Subnet Groups:
aws rds describe-db-subnet-groups \
--db-subnet-group-name aurora-prod-subnet-group
Create Security Group: (if not exists already)
Here we require four security groups (recommended, least privilege):
SG     Used by                    Purpose                               Inbound rules
--------------------------------------------------------------------------------------------------------------------
SG-A   Bastion host               SSH access from your IP               Port 22, Source: YOUR_PUBLIC_IP/32
SG-B   Aurora cluster             MySQL access from Bastion, DMS, App   Port 3306, Source: SG-A, SG-C, SG-D
SG-C   DMS replication instance   MySQL access to source + target       None required
SG-D   Application (optional)     App → Aurora traffic                  None required
Create Bastion SG (SG-A):
BASTION_SG=$(aws ec2 create-security-group \
--group-name sgs-bastion-ssh \
--description "Bastion SSH access" \
--vpc-id vpc-xxxx \
--query GroupId --output text)
aws ec2 authorize-security-group-ingress \
--group-id $BASTION_SG \
--protocol tcp \
--port 22 \
--cidr YOUR_PUBLIC_IP/32
Create Aurora SG (SG-B):
AURORA_SG=$(aws ec2 create-security-group \
--group-name sgs-aurora-mysql \
--description "Aurora MySQL access" \
--vpc-id vpc-xxxx \
--query GroupId --output text)
Allow Bastion → Aurora
aws ec2 authorize-security-group-ingress \
--group-id $AURORA_SG \
--protocol tcp \
--port 3306 \
--source-group $BASTION_SG
Create DMS SG (SG-C):
DMS_SG=$(aws ec2 create-security-group \
--group-name sgs-dms-mysql \
--description "DMS replication access" \
--vpc-id vpc-xxxx \
--query GroupId --output text)
Allow DMS → Aurora
aws ec2 authorize-security-group-ingress \
--group-id $AURORA_SG \
--protocol tcp \
--port 3306 \
--source-group $DMS_SG
(Optional) Create App SG (SG-D):
APP_SG=$(aws ec2 create-security-group \
--group-name sgs-app \
--description "Application outbound to Aurora" \
--vpc-id vpc-xxxx \
--query GroupId --output text)
aws ec2 authorize-security-group-ingress \
--group-id $AURORA_SG \
--protocol tcp \
--port 3306 \
--source-group $APP_SG
2.2: Create Custom Cluster Parameter Group
aws rds create-db-cluster-parameter-group \
--db-cluster-parameter-group-name aurora-mysql-prod-pg \
--db-parameter-group-family aurora-mysql8.0 \
--description "Aurora MySQL prod parameters"
aws rds modify-db-cluster-parameter-group \
--db-cluster-parameter-group-name aurora-mysql-prod-pg \
--parameters \
"ParameterName=character_set_server,ParameterValue=utf8mb4,ApplyMethod=pending-reboot" \
"ParameterName=collation_server,ParameterValue=utf8mb4_0900_ai_ci,ApplyMethod=pending-reboot" \
"ParameterName=time_zone,ParameterValue=UTC,ApplyMethod=pending-reboot" \
"ParameterName=max_allowed_packet,ParameterValue=67108864,ApplyMethod=immediate" \
"ParameterName=log_bin_trust_function_creators,ParameterValue=1,ApplyMethod=immediate" \
"ParameterName=event_scheduler,ParameterValue=ON,ApplyMethod=immediate" \
"ParameterName=binlog_format,ParameterValue=ROW,ApplyMethod=pending-reboot"
Verify parameter values
aws rds describe-db-cluster-parameters \
--db-cluster-parameter-group-name aurora-mysql-prod-pg \
--query "Parameters[?
ParameterName=='character_set_server' ||
ParameterName=='collation_server' ||
ParameterName=='time_zone' ||
ParameterName=='max_allowed_packet' ||
ParameterName=='binlog_format' ||
ParameterName=='log_bin_trust_function_creators' ||
ParameterName=='event_scheduler'
].[ParameterName,ParameterValue,ApplyMethod,IsModifiable]" \
--output table
2.3 Create Aurora Cluster
Create Aurora cluster:
aws rds create-db-cluster \
--db-cluster-identifier aurora-mysql-cluster \
--engine aurora-mysql \
--engine-version 8.0.mysql_aurora.3.04.0 \
--db-cluster-parameter-group-name aurora-mysql-prod-pg \
--master-username admin \
--master-user-password 'StrongPassword123!' \
--database-name appdb \
--db-subnet-group-name aurora-prod-subnet-group \
--vpc-security-group-ids $AURORA_SG \
--backup-retention-period 7 \
--preferred-backup-window 02:00-03:00 \
--preferred-maintenance-window sun:04:00-sun:05:00 \
--storage-encrypted \
--deletion-protection \
--enable-cloudwatch-logs-exports error general slowquery
Add Instances:
aws rds create-db-instance \
--db-instance-identifier aurora-mysql-writer-1 \
--db-instance-class db.t4g.medium \
--engine aurora-mysql \
--db-cluster-identifier aurora-mysql-cluster \
--no-publicly-accessible
aws rds create-db-instance \
--db-instance-identifier aurora-mysql-reader-1 \
--db-instance-class db.t4g.medium \
--engine aurora-mysql \
--db-cluster-identifier aurora-mysql-cluster \
--no-publicly-accessible
Get the Aurora cluster endpoint; this is what DMS needs:
aws rds describe-db-clusters \
--db-cluster-identifier aurora-mysql-cluster \
--query 'DBClusters[0].Endpoint'
Reboot the Aurora instances for the parameter changes made earlier to take effect.
aws rds reboot-db-instance \
--db-instance-identifier aurora-mysql-writer-1
aws rds reboot-db-instance \
--db-instance-identifier aurora-mysql-reader-1
2.4: Deploy a Bastion/Jump Host
Launch Bastion Host (Public Subnet):
aws ec2 run-instances \
--image-id ami-00d8fc944fb171e29 \
--instance-type t3.micro \
--key-name ec2-key \
--security-group-ids $BASTION_SG \
--subnet-id subnet-xxxx \
--associate-public-ip-address \
--tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=bastion-host}]'
2.5: Target Schema Creation
Connect to the Aurora cluster with a MySQL client (CLI, DBeaver, MySQL Workbench).
Create the database schemas and execute the DDL backups taken earlier.
CREATE DATABASE db1;
USE db1;
SOURCE /path/to/backup/db1_schema.sql
2.6: Remove Foreign Key Constraints
Drop all foreign keys from the target Aurora tables. Generate the DROP statements with the query below:
mysql -uadmin -pPassword -h target-aurora
SELECT
CONCAT(
'ALTER TABLE `', kcu.TABLE_SCHEMA, '`.`', kcu.TABLE_NAME,
'` DROP FOREIGN KEY `', kcu.CONSTRAINT_NAME, '`;'
) AS drop_fk_sql
FROM information_schema.KEY_COLUMN_USAGE kcu
WHERE kcu.REFERENCED_TABLE_NAME IS NOT NULL
AND kcu.TABLE_SCHEMA = 'your_schema_name';
Execute the script output in SQL CLI on target Aurora.
3.1 Replication Instance
Create dms-vpc-role:
Trust policy:
cat > dms-vpc-trust.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
Create role:
aws iam create-role \
--role-name dms-vpc-role \
--assume-role-policy-document file://dms-vpc-trust.json
Attach AWS-managed policy:
aws iam attach-role-policy \
--role-name dms-vpc-role \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonDMSVPCManagementRole
Create dms-cloudwatch-logs-role:
Trust policy:
cat > dms-cw-trust.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
Create role:
aws iam create-role \
--role-name dms-cloudwatch-logs-role \
--assume-role-policy-document file://dms-cw-trust.json
Attach policy:
aws iam attach-role-policy \
--role-name dms-cloudwatch-logs-role \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonDMSCloudWatchLogsRole
Verify roles exist:
aws iam list-roles | grep dms
Create Replication Subnet Group:
aws dms create-replication-subnet-group \
--replication-subnet-group-identifier dms-private-subnet-group \
--replication-subnet-group-description "Private subnets for AWS DMS replication instance" \
--subnet-ids subnet-aaaaaaa subnet-bbbbbbb
aws dms describe-replication-subnet-groups \
--query 'ReplicationSubnetGroups[*].[ReplicationSubnetGroupIdentifier,SubnetGroupStatus,VpcId]' \
--output table
Create Replication Instance (Private Subnet):
aws dms create-replication-instance \
--replication-instance-identifier dms-mysql \
--replication-instance-class dms.c5.large \
--allocated-storage 200 \
--replication-subnet-group-identifier dms-private-subnet-group \
--vpc-security-group-ids $DMS_SG
aws dms describe-replication-instances \
--query 'ReplicationInstances[*].[ReplicationInstanceIdentifier,ReplicationInstanceArn]' \
--output table
Create Source Endpoint. For compliance environments, use verify-ca or verify-full.
aws dms create-endpoint \
--endpoint-identifier src-mysql \
--endpoint-type source \
--engine-name mysql \
--username dms_user \
--password StrongPassword123! \
--server-name EC2_IP/ON_PREM_IP \
--port 3306 \
--ssl-mode verify-ca
Create Aurora Endpoint. For compliance environments, use verify-ca or verify-full.
aws dms create-endpoint \
--endpoint-identifier tgt-aurora \
--endpoint-type target \
--engine-name aurora \
--username admin \
--password StrongPassword123! \
--server-name AURORA_ENDPOINT \
--port 3306 \
--ssl-mode verify-ca
Verify Endpoints:
aws dms describe-endpoints \
--query 'Endpoints[*].[EndpointIdentifier,EndpointType,EngineName,ServerName]' \
--output table
Test Connections(both src-mysql and tgt-aurora):
aws dms test-connection \
--replication-instance-arn arn:aws:dms:region:acct:rep:xxxx \
--endpoint-arn arn:aws:dms:region:acct:endpoint:yyyy
aws dms describe-connections \
--query 'Connections[*].[EndpointArn,Status,LastFailureMessage]' \
--output table
Note: If the source MySQL connection fails, check that an active NAT Gateway is configured, since the DMS replication instance runs in a private subnet.
aws ec2 describe-route-tables \
--filters Name=association.subnet-id,Values=<DMS_PRIVATE_SUBNET_ID> \
--query 'RouteTables[*].Routes'
3.2: Task Settings
Choose the table-preparation strategy for the DMS task: set "TargetTablePrepMode" to "TRUNCATE_BEFORE_LOAD" or "DO_NOTHING" (DO_NOTHING here, since the target schema was pre-created in step 2.5).
Create a file named task-settings.json with the settings below. Tune them based on data size and replication instance class.
cat > task-settings.json <<EOF
{
"FullLoadSettings": {
"TargetTablePrepMode": "DO_NOTHING",
"CreatePkAfterFullLoad": false,
"StopTaskCachedChangesApplied": false,
"StopTaskCachedChangesNotApplied": false,
"MaxFullLoadSubTasks": 8,
"CommitRate": 10000
},
"TargetMetadata": {
"SupportLobs": true,
"FullLobMode": true,
"LobMaxSize": 2000000,
"BatchApplyEnabled": true,
"BatchApplyPreserveTransaction": true
},
"ChangeProcessingDdlHandlingPolicy": {
"HandleSourceTableDropped": false,
"HandleSourceTableTruncated": false,
"HandleSourceTableAltered": true,
"SkipTableSuspensionForPartitionDdl": true
},
"ErrorBehavior": {
"DataErrorPolicy": "LOG_ERROR",
"DataTruncationErrorPolicy": "LOG_ERROR",
"TableErrorEscalationPolicy": "STOP_TASK"
},
"Logging": {
"EnableLogging": true
}
}
EOF
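Before creating the task, it is worth confirming the settings file parses and carries the prep mode you intend (jq is already used elsewhere in this runbook; the trimmed sample file name below is illustrative):

```shell
# Illustrative, trimmed settings file; the real one is task-settings.json.
cat > task-settings-sample.json <<'EOF'
{
  "FullLoadSettings": { "TargetTablePrepMode": "DO_NOTHING", "CommitRate": 10000 }
}
EOF
# jq fails loudly on invalid JSON and extracts the prep mode in one step.
PREP_MODE=$(jq -r '.FullLoadSettings.TargetTablePrepMode' task-settings-sample.json)
echo "$PREP_MODE"
```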
3.3: Table Mappings
Create a file named table-mappings.json with selection rules that replicate ALL databases except mysql and the other system schemas.
cat > table-mappings.json <<EOF
{
"rules": [
{
"rule-type": "selection",
"rule-id": "1",
"rule-name": "include_all",
"object-locator": {
"schema-name": "%",
"table-name": "%"
},
"rule-action": "include"
},
{
"rule-type": "selection",
"rule-id": "2",
"rule-name": "exclude_mysql",
"object-locator": {
"schema-name": "mysql",
"table-name": "%"
},
"rule-action": "exclude"
},
{
"rule-type": "selection",
"rule-id": "3",
"rule-name": "exclude_sys",
"object-locator": {
"schema-name": "sys",
"table-name": "%"
},
"rule-action": "exclude"
},
{
"rule-type": "selection",
"rule-id": "4",
"rule-name": "exclude_performance_schema",
"object-locator": {
"schema-name": "performance_schema",
"table-name": "%"
},
"rule-action": "exclude"
},
{
"rule-type": "selection",
"rule-id": "5",
"rule-name": "exclude_information_schema",
"object-locator": {
"schema-name": "information_schema",
"table-name": "%"
},
"rule-action": "exclude"
}
]
}
EOF
3.4: Task Creation
aws dms create-replication-task \
--replication-task-identifier mysql-migrate \
--source-endpoint-arn arn:aws:dms:region:acct:endpoint:xxxxx \
--target-endpoint-arn arn:aws:dms:region:acct:endpoint:xxxxx \
--replication-instance-arn arn:aws:dms:region:acct:rep:xxxxx \
--migration-type full-load-and-cdc \
--table-mappings file://table-mappings.json \
--replication-task-settings file://task-settings.json
aws dms describe-replication-tasks \
--query 'ReplicationTasks[*].{ID:ReplicationTaskIdentifier,ARN:ReplicationTaskArn,Status:Status}'
aws dms describe-connections \
--query 'Connections[*].{Endpoint:EndpointIdentifier,Status:Status,Failure:LastFailureMessage}'
Review DMS task settings:
aws dms describe-replication-tasks \
--query 'ReplicationTasks[*].[ReplicationTaskIdentifier,ReplicationTaskArn,Status]' \
--output table
aws dms describe-replication-tasks \
--query 'ReplicationTasks[0].ReplicationTaskSettings' \
--output text \
| jq -r '.' \
| jq '{FullLoadSettings,TargetMetadata,ChangeProcessingDdlHandlingPolicy,Logging}'
3.5: Pre‑Migration Assessment
All FAILED items must be fixed before proceeding.
The IAM user or role executing AWS CLI commands must have DMS control-plane permissions to start and review the assessment.
Create the policy file:
cat > dms-assessment-policy.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"dms:StartReplicationTaskAssessment",
"dms:DescribeReplicationTaskAssessmentResults",
"dms:DescribeReplicationTasks",
"dms:DescribeEndpoints",
"dms:DescribeReplicationInstances"
],
"Resource": "*"
}
]
}
EOF
Create and attach the policy:
aws iam create-policy \
--policy-name DMSPreMigrationAssessmentPolicy \
--policy-document file://dms-assessment-policy.json
aws iam attach-user-policy \
--user-name YOUR_IAM_USER_NAME \
--policy-arn arn:aws:iam::<ACCOUNT_ID>:policy/DMSPreMigrationAssessmentPolicy
AWS DMS requires a dedicated service role to write assessment results to an AWS-managed S3 bucket.
Create the trust policy:
cat > dms-access-trust.json <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "dms.amazonaws.com"
},
"Action": "sts:AssumeRole"
}
]
}
EOF
Create the role and attach the AWS-managed policy:
aws iam create-role \
--role-name dms-access-for-tasks \
--assume-role-policy-document file://dms-access-trust.json
aws iam attach-role-policy \
--role-name dms-access-for-tasks \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonDMSRedshiftS3Role
Verify the role exists:
aws iam get-role --role-name dms-access-for-tasks
Start the pre-migration assessment (CLI):
aws dms start-replication-task-assessment \
--replication-task-arn arn:aws:dms:region:acct:task:xxxx
Check assessment status and results:
aws dms describe-replication-task-assessment-results \
--replication-task-arn arn:aws:dms:region:acct:task:xxxx
View and parse the assessment report (CLI)
Extract and pretty-print the full JSON report:
aws dms describe-replication-task-assessment-results \
--replication-task-arn arn:aws:dms:region:account-id:task:task-id \
--query 'ReplicationTaskAssessmentResults[0].AssessmentResults' \
--output text \
| jq .
Download the report (optional):
If a local copy is required, use the pre-signed URL returned in S3ObjectUrl:
curl -o assessment-report.json "<S3ObjectUrl>"
jq . assessment-report.json
4. Full Load + CDC Execution
Start Replication Task
aws dms start-replication-task \
--replication-task-arn <TASK_ARN> \
--start-replication-task-type start-replication
Monitor the replication task in CloudWatch and via the CLI.
aws dms describe-replication-tasks \
--query 'ReplicationTasks[*].{ID:ReplicationTaskIdentifier,ARN:ReplicationTaskArn,Status:Status}'
See table-level migration statistics:
aws dms describe-table-statistics \
--replication-task-arn arn:aws:dms:region:account-id:task:task-id \
--query 'TableStatistics[*].{
Schema:SchemaName,
Table:TableName,
State:TableState,
FullLoadRows:FullLoadRows,
Inserts:Inserts,
Updates:Updates,
Deletes:Deletes
}' \
--output table
Show only tables NOT completed (Use this to quickly spot problem tables.):
aws dms describe-table-statistics \
--replication-task-arn arn:aws:dms:region:account-id:task:task-id \
--query "TableStatistics[?TableState!='Table completed'].{
Schema:SchemaName,
Table:TableName,
State:TableState
}" \
--output table
Check single table (targeted debugging):
aws dms describe-table-statistics \
--replication-task-arn arn:aws:dms:region:account-id:task:task-id \
--query "TableStatistics[?TableName=='tbl_name'].{
Schema:SchemaName,
Table:TableName,
State:TableState,
FullLoadRows:FullLoadRows
}" \
--output table
Show the exact error for the failed table:
aws dms describe-table-statistics \
--replication-task-arn arn:aws:dms:region:account-id:task:task-id \
--query "TableStatistics[?TableName=='tbl_name'].{
Schema:SchemaName,
Table:TableName,
State:TableState,
Error:LastError,
ErrorCount:ErrorCount
}" \
--output table
aws dms describe-replication-tasks \
--query "ReplicationTasks[?ReplicationTaskArn=='arn:aws:dms:region:account-id:task:task-id'].ReplicationTaskStats"
Show the CloudWatch Error logs:
aws logs describe-log-groups \
--log-group-name-prefix dms-tasks
aws logs describe-log-streams \
--log-group-name <log-group-name> \
--order-by LastEventTime \
--descending
aws logs get-log-events \
--log-group-name <log-group-name> \
--log-stream-name <LOG_STREAM_NAME> \
--limit 50
Use CloudWatch Logs filter and search by table name:
aws logs filter-log-events \
--log-group-name <log-group-name> \
--filter-pattern '"schema.tbl_name"' \
--limit 50 \
--query 'events[*].message' \
--output text
Resume the DMS migration task (required when one or more tables hit an error during replication):
aws dms describe-replication-tasks \
--query "ReplicationTasks[?ReplicationTaskArn=='arn:aws:dms:region:account-id:task:task-id'].Status"
aws dms start-replication-task \
--replication-task-arn arn:aws:dms:region:account-id:task:task-id \
--start-replication-task-type resume-processing
Show CDC activity only (cutover prep):
aws dms describe-replication-tasks \
--query "ReplicationTasks[?ReplicationTaskArn=='arn:aws:dms:region:account-id:task:task-id'].{
Status:Status,
MigrationType:MigrationType
}" \
--output table
aws dms describe-table-statistics \
--replication-task-arn arn:aws:dms:region:account-id:task:task-id \
--query 'TableStatistics[*].{
Schema:SchemaName,
Table:TableName,
Inserts:Inserts,
Updates:Updates,
Deletes:Deletes
}' \
--output table
5.1 CDC Drain Verification
To ensure CDC lag = 0 before cutover, check both the DMS-side latency metrics and the task state, then freeze writes briefly to let CDC drain.
You must verify ALL three are true:
1. Source latency = 0
2. Target latency = 0
3. No new writes happening on source
If any one is non-zero → do NOT cut over.
The latencies may sometimes read NULL. In that case, confirm that no writes are occurring on the source and follow the next steps.
Check CDC lag from AWS CLI:
Run this:
aws dms describe-replication-tasks \
--query "ReplicationTasks[?ReplicationTaskArn=='arn:aws:dms:region:acct:task:xxxx'].{
FullLoadProgress:ReplicationTaskStats.FullLoadProgressPercent,
TablesLoaded:ReplicationTaskStats.TablesLoaded,
TablesLoading:ReplicationTaskStats.TablesLoading,
TablesErrored:ReplicationTaskStats.TablesErrored,
CDCInserts:ReplicationTaskStats.CdcInserts,
CDCUpdates:ReplicationTaskStats.CdcUpdates,
CDCDeletes:ReplicationTaskStats.CdcDeletes
}" \
--output table
Key fields to watch
{
"FullLoadProgressPercent": 100,
"TablesLoading": 0
}
aws dms describe-replication-tasks \
--query 'ReplicationTasks[*].{
Task:ReplicationTaskIdentifier,
Status:Status,
SourceLag:ReplicationTaskStats.CDCLatencySource,
TargetLag:ReplicationTaskStats.CDCLatencyTarget
}'
Expected before cutover:
Status = running
SourceLag = 0
TargetLag = 0
Confirm task is in CDC mode (not full load):
aws dms describe-replication-tasks \
--query 'ReplicationTasks[*].{
Task:ReplicationTaskIdentifier,
Status:Status,
FullLoad:ReplicationTaskStats.FullLoadProgressPercent
}'
Must show:
FullLoadProgressPercent = 100
Status = running
5.2 Freeze Writes
Freeze writes on SOURCE (MANDATORY)
CDC lag can bounce between 0–2 seconds forever if writes continue.
App maintenance mode
Disable writes at app layer
SET GLOBAL read_only = ON;
SET GLOBAL super_read_only = ON;
Wait until CDC drains to ZERO
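The "hold at zero" requirement can be gated in a small script. A minimal sketch of the decision logic only (sample readings hard-coded; in practice feed it the CDCLatencySource/CDCLatencyTarget values from describe-replication-tasks, where NULL must be treated as unknown, never as zero):

```shell
# Count consecutive zero-lag samples; any non-zero or NULL reading resets the
# hold window, so cutover only proceeds after a sustained drain.
drained_samples=0
check_lag() {
  local src="$1" tgt="$2"
  if [ "$src" = "0" ] && [ "$tgt" = "0" ]; then
    drained_samples=$((drained_samples + 1))
  else
    drained_samples=0   # non-zero lag or unknown (NULL) resets the counter
  fi
}
check_lag 0 0
check_lag null 0   # unknown latency: reset, fall back to row-count validation
check_lag 0 0
check_lag 0 0
echo "$drained_samples"
```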
5.3: Row Count Validation
Exact row comparison between source and target.
Run both Source MySQL and Target Aurora and compare:
SET SESSION group_concat_max_len = 1000000;
SELECT
GROUP_CONCAT(
CONCAT(
'SELECT ''', table_name,
''' AS table_name, COUNT(*) AS row_count FROM `',
table_schema, '`.`', table_name, '`'
)
SEPARATOR ' UNION ALL '
) INTO @sql
FROM information_schema.tables
WHERE table_schema = 'employees'
AND table_type = 'BASE TABLE';
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
For partitioned tables, total count may match but partition data may not.
Check per-partition (Aurora and Source MySQL)
SELECT
PARTITION_NAME,
TABLE_ROWS
FROM information_schema.partitions
WHERE table_schema='your_db'
AND table_name='partitioned_table';
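For bulk comparison, the per-table counts can also be diffed offline. A minimal sketch, assuming the UNION ALL query output from each side has been saved as sorted tab-separated table/count files (the names and numbers below are illustrative):

```shell
# Illustrative exports; in practice save the row-count query output from
# source and target as sorted "table<TAB>count" files.
printf 'departments\t9\nemployees\t300024\n' > src_counts.tsv
printf 'departments\t8\nemployees\t300024\n' > tgt_counts.tsv
# join pairs counts by table name; awk reports only the tables that differ.
MISMATCHES=$(join src_counts.tsv tgt_counts.tsv \
  | awk '$2 != $3 {print $1" source="$2" target="$3}')
echo "$MISMATCHES"
```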
5.4: Rollback Safety
This step is mandatory.
Before stopping DMS, capture binlog position:
aws dms describe-replication-tasks \
--query 'ReplicationTasks[0].ReplicationTaskStats'
Record from output:
LastSourceTransactionTimestamp
LastSourceTransactionID
LastSourceBinlogFile
LastSourceBinlogPosition
5.5: Cutover Point
Re-run:
aws dms describe-replication-tasks \
--query 'ReplicationTasks[*].{
SourceLag:ReplicationTaskStats.CDCLatencySource,
TargetLag:ReplicationTaskStats.CDCLatencyTarget
}'
Wait until:
SourceLag = 0
TargetLag = 0
Hold this state for at least 30–60 seconds.
Sometimes it may show NULL for SourceLag and TargetLag.
If this is your case then perform the steps in 5.3: Row Count Validation.
If the row counts are consistent across source and target you are good to cutover.
Validate no pending changes (extra safe)
aws dms describe-table-statistics \
--replication-task-arn arn:aws:dms:region:acct:task:xxxx \
--query 'TableStatistics[*].{
Table:TableName,
State:TableState,
Inserts:Inserts,
Updates:Updates,
Deletes:Deletes
}'
If these counters stop increasing → CDC is drained.
STOP the task (this is the CUTOVER point)
Once lag = 0 and writes are frozen:
aws dms stop-replication-task \
--replication-task-arn arn:aws:dms:region:acct:task:xxxx
This guarantees:
No more CDC
Target is consistent
6.1: Recreate Stored Procedures, Functions, Events, Triggers and Views.
Phase-1: Prepare the creation scripts.
Extract Stored procedures + functions from SOURCE:
mysqldump -h source-host -u admin -p your_db --routines --no-create-info --no-data --skip-triggers > routines.sql
Extract Triggers from SOURCE:
mysqldump -h source-host -u admin -p your_db --triggers --no-create-info --no-data > triggers.sql
Extract Views from SOURCE (mysqldump has no views-only flag; pass the view names explicitly):
mysqldump -h source-host -u admin -p your_db $(mysql -h source-host -u admin -p -N -e "SHOW FULL TABLES IN your_db WHERE Table_type='VIEW'" | awk '{print $1}') --no-data --skip-triggers > views.sql
Alternatively, use SQL on the source MySQL. Note: information_schema.ROUTINES.ROUTINE_DEFINITION holds only the routine body, not the parameter list, so for routines that take parameters prefer the mysqldump output or SHOW CREATE PROCEDURE / SHOW CREATE FUNCTION.
SET SESSION group_concat_max_len = 1024000;
Extract procedures from SOURCE:
SELECT
CONCAT(
'DELIMITER $$\n',
'CREATE PROCEDURE `', ROUTINE_SCHEMA, '`.`', ROUTINE_NAME, '` ',
ROUTINE_DEFINITION,
'\n$$\nDELIMITER ;'
) AS create_procedure_sql
FROM information_schema.ROUTINES
WHERE ROUTINE_TYPE = 'PROCEDURE'
AND ROUTINE_SCHEMA = 'your_db';
Extract functions from SOURCE:
SELECT
CONCAT(
'DELIMITER $$\n',
'CREATE FUNCTION `', ROUTINE_SCHEMA, '`.`', ROUTINE_NAME, '` ',
ROUTINE_DEFINITION,
'\n$$\nDELIMITER ;'
) AS create_function_sql
FROM information_schema.ROUTINES
WHERE ROUTINE_TYPE = 'FUNCTION'
AND ROUTINE_SCHEMA = 'your_db';
Extract triggers from SOURCE:
SELECT
CONCAT(
'DELIMITER $$\n',
'CREATE TRIGGER `', TRIGGER_SCHEMA, '`.`', TRIGGER_NAME, '` ',
ACTION_TIMING, ' ', EVENT_MANIPULATION,
' ON `', EVENT_OBJECT_SCHEMA, '`.`', EVENT_OBJECT_TABLE, '` ',
'FOR EACH ROW ',
ACTION_STATEMENT,
'\n$$\nDELIMITER ;'
) AS create_trigger_sql
FROM information_schema.TRIGGERS
WHERE TRIGGER_SCHEMA = 'your_db';
Extract events from SOURCE (this covers recurring events; one-time events use EXECUTE_AT instead of EVERY):
SELECT
CONCAT(
'DELIMITER $$\n',
'CREATE EVENT `', EVENT_SCHEMA, '`.`', EVENT_NAME, '` ',
'ON SCHEDULE EVERY ', INTERVAL_VALUE, ' ', INTERVAL_FIELD,
' DO ', EVENT_DEFINITION,
'\n$$\nDELIMITER ;'
) AS create_event_sql
FROM information_schema.EVENTS
WHERE EVENT_SCHEMA = 'your_db'
AND EVENT_TYPE = 'RECURRING';
Extract views from SOURCE:
SELECT
CONCAT(
'CREATE OR REPLACE VIEW `', TABLE_SCHEMA, '`.`', TABLE_NAME, '` AS ',
VIEW_DEFINITION,
';'
) AS create_view_sql
FROM information_schema.VIEWS
WHERE TABLE_SCHEMA = 'your_db';
DEFINER FIX (CRITICAL for Aurora): Failure to fix DEFINER will cause silent runtime failures in Aurora.
Replace definers using Aurora current user for each output of above scripts:
sed -i 's/DEFINER=`[^`]*`@`[^`]*`/DEFINER=CURRENT_USER/g' file_name.sql
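Before running the sed command on the real dump files, it can be verified on a throwaway sample (the definer and procedure names below are made up):

```shell
# Write a sample line with a source-style definer, rewrite it, and inspect.
printf 'CREATE DEFINER=`app_user`@`10.0.0.5` PROCEDURE `p_demo`() BEGIN END\n' > routines_sample.sql
sed -i 's/DEFINER=`[^`]*`@`[^`]*`/DEFINER=CURRENT_USER/g' routines_sample.sql
cat routines_sample.sql
```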
Phase-2: Apply the scripts on Aurora in the order below
1. Stored Procedures
2. Functions
3. Views
4. Triggers
5. Events
Phase-3: Validation after apply (REQUIRED)
On Aurora:
SHOW PROCEDURE STATUS WHERE Db='your_db';
SHOW FUNCTION STATUS WHERE Db='your_db';
SHOW TRIGGERS FROM your_db;
SHOW EVENTS FROM your_db;
SHOW FULL TABLES IN your_db WHERE Table_type='VIEW';
6.2: Recreate Foreign Keys
Phase 1: Extract FK Metadata from Source MySQL
Run this on SOURCE MySQL.
SELECT
kcu.CONSTRAINT_SCHEMA,
kcu.TABLE_NAME,
kcu.CONSTRAINT_NAME,
GROUP_CONCAT(kcu.COLUMN_NAME ORDER BY kcu.ORDINAL_POSITION) AS child_columns,
kcu.REFERENCED_TABLE_NAME,
GROUP_CONCAT(kcu.REFERENCED_COLUMN_NAME ORDER BY kcu.ORDINAL_POSITION) AS parent_columns,
rc.UPDATE_RULE,
rc.DELETE_RULE
FROM information_schema.KEY_COLUMN_USAGE kcu
JOIN information_schema.REFERENTIAL_CONSTRAINTS rc
ON kcu.CONSTRAINT_SCHEMA = rc.CONSTRAINT_SCHEMA
AND kcu.CONSTRAINT_NAME = rc.CONSTRAINT_NAME
WHERE kcu.REFERENCED_TABLE_NAME IS NOT NULL
GROUP BY
kcu.CONSTRAINT_SCHEMA,
kcu.TABLE_NAME,
kcu.CONSTRAINT_NAME,
kcu.REFERENCED_TABLE_NAME,
rc.UPDATE_RULE,
rc.DELETE_RULE
ORDER BY kcu.CONSTRAINT_SCHEMA, kcu.TABLE_NAME;
Phase 2: Generate FK Recreation SQL (AUTO-GENERATED)
This query outputs executable ALTER TABLE statements.
SELECT
CONCAT(
'ALTER TABLE `', TABLE_NAME, '` ',
'ADD CONSTRAINT `', CONSTRAINT_NAME, '` ',
'FOREIGN KEY (`', REPLACE(child_columns, ',', '`,`'), '`) ',
'REFERENCES `', REFERENCED_TABLE_NAME, '` (`',
REPLACE(parent_columns, ',', '`,`'), '`) ',
'ON DELETE ', DELETE_RULE, ' ',
'ON UPDATE ', UPDATE_RULE, ';'
) AS fk_ddl
FROM (
SELECT
kcu.TABLE_NAME,
kcu.CONSTRAINT_NAME,
GROUP_CONCAT(kcu.COLUMN_NAME ORDER BY kcu.ORDINAL_POSITION) AS child_columns,
kcu.REFERENCED_TABLE_NAME,
GROUP_CONCAT(kcu.REFERENCED_COLUMN_NAME ORDER BY kcu.ORDINAL_POSITION) AS parent_columns,
rc.UPDATE_RULE,
rc.DELETE_RULE
FROM information_schema.KEY_COLUMN_USAGE kcu
JOIN information_schema.REFERENTIAL_CONSTRAINTS rc
ON kcu.CONSTRAINT_SCHEMA = rc.CONSTRAINT_SCHEMA
AND kcu.CONSTRAINT_NAME = rc.CONSTRAINT_NAME
WHERE kcu.REFERENCED_TABLE_NAME IS NOT NULL
AND kcu.CONSTRAINT_SCHEMA = 'your_database'
GROUP BY
kcu.TABLE_NAME,
kcu.CONSTRAINT_NAME,
kcu.REFERENCED_TABLE_NAME,
rc.UPDATE_RULE,
rc.DELETE_RULE
) x;
Save output as:
fk_recreate.sql
Phase 3: Apply Safely on Aurora MySQL
Disable FK checks while loading. Note: foreign_key_checks is a session variable, so a SET issued from a separate client session does not affect the batch run below; put the SET statements inside the script itself (top and bottom):
SET foreign_key_checks = 0;
-- contents of fk_recreate.sql
SET foreign_key_checks = 1;
Execute the FK script
mysql -h aurora-endpoint -u admin -p your_database < fk_recreate.sql
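Wrapping the generated file can be done mechanically. A minimal sketch (the ALTER statement is illustrative; fk_apply.sql is an assumed output name):

```shell
# Build fk_apply.sql: the generated FK DDL wrapped with session-scoped SET
# statements so they execute in the same connection as the ALTERs.
printf 'ALTER TABLE `t_child` ADD CONSTRAINT `fk_demo` FOREIGN KEY (`pid`) REFERENCES `t_parent` (`id`);\n' > fk_recreate.sql
{
  echo 'SET foreign_key_checks = 0;'
  cat fk_recreate.sql
  echo 'SET foreign_key_checks = 1;'
} > fk_apply.sql
head -n 1 fk_apply.sql
```

Then apply fk_apply.sql with the mysql client exactly as above.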
Phase 4: Validation Queries (MANDATORY)
Validate FK Count
SELECT COUNT(*)
FROM information_schema.REFERENTIAL_CONSTRAINTS
WHERE CONSTRAINT_SCHEMA = 'your_database';
Validate FK Integrity
SELECT
rc.CONSTRAINT_NAME,
rc.TABLE_NAME,
rc.REFERENCED_TABLE_NAME
FROM information_schema.REFERENTIAL_CONSTRAINTS rc
WHERE rc.CONSTRAINT_SCHEMA = 'your_database';
7. Rollback Strategy
Phase-1: Source remains read‑only for N hours
On Source
SET GLOBAL read_only = ON;
SET GLOBAL super_read_only = ON;
Phase-2: Aurora snapshot taken before app switch
aws rds create-db-cluster-snapshot \
--db-cluster-identifier aurora-mysql-cluster \
--db-cluster-snapshot-identifier aurora-pre-app-switch-$(date -u +%Y%m%d-%H%M%S)
Wait until the snapshot status is available before switching the app (the aws rds wait db-cluster-snapshot-available waiter can block until then).
aws rds describe-db-cluster-snapshots \
--db-cluster-identifier aurora-mysql-cluster \
--query 'DBClusterSnapshots[*].[DBClusterSnapshotIdentifier,Status,SnapshotCreateTime]' \
--output table
8: Start Application Traffic
Perform smoke tests.
After a rollback, or once cutover success is confirmed, re-enable writes on the source MySQL:
SET GLOBAL read_only = OFF;
SET GLOBAL super_read_only = OFF;
9. Post‑Go‑Live Validation
Query latency comparison
Enable slow query log
Monitor errors for 24–48 hours
Decommission the Source MySQL