博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Performance Testing: Self-built Databases vs. ApsaraDB for RDS
阅读量:5880 次
发布时间:2019-06-19

本文共 9618 字,大约阅读时间需要 32 分钟。

DataWarehouse_FriendlyDatabaseDesign

Abstract: Poorly configured ApsaraDB instances may appear to underperform compared with self-built databases. This article discusses how to troubleshoot this problem in your databases.

Recent comparisons of self-built databases with RDS show that self-built databases tend to have higher query speeds than those seen on RDS instances. Take a logistics company for example. At the very beginning, the ApsaraDB for RDS SQL server database accesses the RDS instance via ECS and executes a statement which takes about 60s, however the RDS SQL server database accesses the local self-built databases on the ECS in only 2-3s. So, is ApsaraDB RDS really worse than self-built databases? Next, we will discuss the key points for comparing self-built databases with RDS and how to compare their performances fairly.

The following factors should be taken into account when comparing the statement execution performance between self-built databases and RDS:

1. Zones and Network Connections

Please refer to (article in Chinese) for a deep analysis of zones and network connections. If network factors need to be verified, a profiler can be enabled in RDS, a network packet captured in the client, and the execution termination time in the RDS compared with the resulting return time in the network packet. A bigger time difference indicates slower network transmission speeds.

For an RDS SQL Server 2012 instance, the SQL Server Profiler can be enabled to capture events such as RPC:Completed, SQL:StmtStarting, SQL:StmtCompleted, SQL:BatchStarting, and SQL:BatchCompleted.

The SQL Server Profiler cannot currently be enabled for RDS SQL Server 2008 R2 instances, however the following statement can query the start_time and total_elapsed_time for recently executed statements. total_elapased_time represents the total time required to execute the statement after it reaches the SQL Server (the unit is ms).

2. Instance Parameter Configuration

A large variety of parameters need to be taken into account for a MySQL instance. Please refer to (article in Chinese) for detailed analysis and descriptions.

The main parameters that need to be taken into account include fill factor (%), max degree of parallelism, and max server memory (MB).
• Fill Factor (%): A server-side parameter for optimizing data storage and performance. When an index is created or re-created, this value is used to determine the percentage of data space to be filled on each leaf page to reserve room for expanding the index.
• Max Degree of Parallelism (MaxDOP): Limits the number of processors used when a parallel plan is executed, i.e., limiting the degree of parallelism (DOP) of a statement.
• Max Server Memory (MB): Sets an upper limit for memory obtained by a buffer pool.

3. Resource Waiting or Blocking

Comparisons must be made between the two environments to determine if waiting and blocking is occurring during execution of the statement. View information on waiting:

WITH [Waits] AS    (SELECT        [wait_type],        [wait_time_ms] / 1000.0 AS [WaitS],        ([wait_time_ms] - [signal_wait_time_ms]) / 1000.0 AS [ResourceS],        [signal_wait_time_ms] / 1000.0 AS [SignalS],        [waiting_tasks_count] AS [WaitCount],       100.0 * [wait_time_ms] / SUM ([wait_time_ms]) OVER() AS [Percentage],        ROW_NUMBER() OVER(ORDER BY [wait_time_ms] DESC) AS [RowNum]    FROM sys.dm_os_wait_stats    WHERE [wait_type] NOT IN (        N'BROKER_EVENTHANDLER', N'BROKER_RECEIVE_WAITFOR',        N'BROKER_TASK_STOP', N'BROKER_TO_FLUSH',        N'BROKER_TRANSMITTER', N'CHECKPOINT_QUEUE',        N'CHKPT', N'CLR_AUTO_EVENT',        N'CLR_MANUAL_EVENT', N'CLR_SEMAPHORE',  -- Maybe uncomment these four if you have mirroring issues        N'DBMIRROR_DBM_EVENT', N'DBMIRROR_EVENTS_QUEUE',        N'DBMIRROR_WORKER_QUEUE', N'DBMIRRORING_CMD',         N'DIRTY_PAGE_POLL', N'DISPATCHER_QUEUE_SEMAPHORE',        N'EXECSYNC', N'FSAGENT',        N'FT_IFTS_SCHEDULER_IDLE_WAIT', N'FT_IFTSHC_MUTEX',         -- Maybe uncomment these six if you have AG issues        N'HADR_CLUSAPI_CALL', N'HADR_FILESTREAM_IOMGR_IOCOMPLETION',        N'HADR_LOGCAPTURE_WAIT', N'HADR_NOTIFICATION_DEQUEUE',        N'HADR_TIMER_TASK', N'HADR_WORK_QUEUE',   N'KSOURCE_WAKEUP', N'LAZYWRITER_SLEEP',        N'LOGMGR_QUEUE', N'MEMORY_ALLOCATION_EXT',        N'ONDEMAND_TASK_QUEUE',        N'PREEMPTIVE_XE_GETTARGETSTATE',        N'PWAIT_ALL_COMPONENTS_INITIALIZED',        N'PWAIT_DIRECTLOGCONSUMER_GETNEXT',        N'QDS_PERSIST_TASK_MAIN_LOOP_SLEEP', N'QDS_ASYNC_QUEUE',        N'QDS_CLEANUP_STALE_QUERIES_TASK_MAIN_LOOP_SLEEP',        N'QDS_SHUTDOWN_QUEUE', N'REDO_THREAD_PENDING_WORK',        N'REQUEST_FOR_DEADLOCK_SEARCH', N'RESOURCE_QUEUE',        N'SERVER_IDLE_CHECK', N'SLEEP_BPOOL_FLUSH',        N'SLEEP_DBSTARTUP', N'SLEEP_DCOMSTARTUP',        N'SLEEP_MASTERDBREADY', N'SLEEP_MASTERMDREADY',        N'SLEEP_MASTERUPGRADED', N'SLEEP_MSDBSTARTUP',        N'SLEEP_SYSTEMTASK', N'SLEEP_TASK',        N'SLEEP_TEMPDBSTARTUP', N'SNI_HTTP_ACCEPT',        N'SP_SERVER_DIAGNOSTICS_SLEEP', N'SQLTRACE_BUFFER_FLUSH',        N'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',  N'SQLTRACE_WAIT_ENTRIES', N'WAIT_FOR_RESULTS',        N'WAITFOR', N'WAITFOR_TASKSHUTDOWN',        N'WAIT_XTP_RECOVERY',        N'WAIT_XTP_HOST_WAIT', N'WAIT_XTP_OFFLINE_CKPT_NEW_LOG',        N'WAIT_XTP_CKPT_CLOSE', N'XE_DISPATCHER_JOIN',        N'XE_DISPATCHER_WAIT', N'XE_TIMER_EVENT')    AND [waiting_tasks_count] > 0    )SELECT   MAX ([W1].[wait_type]) AS [WaitType],    CAST (MAX ([W1].[WaitS]) AS DECIMAL (16,2)) AS [Wait_S],    CAST (MAX ([W1].[ResourceS]) AS DECIMAL (16,2)) AS [Resource_S],    CAST (MAX ([W1].[SignalS]) AS DECIMAL (16,2)) AS [Signal_S],    MAX ([W1].[WaitCount]) AS [WaitCount],    CAST (MAX ([W1].[Percentage]) AS DECIMAL (5,2)) AS [Percentage],    CAST ((MAX ([W1].[WaitS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgWait_S],    CAST ((MAX ([W1].[ResourceS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgRes_S],    CAST ((MAX ([W1].[SignalS]) / MAX ([W1].[WaitCount])) AS DECIMAL (16,4)) AS [AvgSig_S],    CAST ('https://www.sqlskills.com/help/waits/' + MAX ([W1].[wait_type]) as XML) AS [Help/Info URL]FROM [Waits] AS [W1]INNER JOIN [Waits] AS [W2]    ON [W2].[RowNum] <= [W1].[RowNum]GROUP BY [W1].[RowNum]HAVING SUM ([W2].[Percentage]) - MAX( [W1].[Percentage] ) < 95; -- percentage thresholdGO

Refer to (article in Chinese) for blocking.

4. Whether the index fragmentation rate and statistical information are consistent across the two environments

Follow the below guidelines when checking the index fragment rate of a statement:

SELECT dbschemas.[name] as 'Schema',dbtables.[name] as 'Table',dbindexes.[name] as 'Index',indexstats.avg_fragmentation_in_percent,indexstats.page_countFROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS indexstatsINNER JOIN sys.tables dbtables on dbtables.[object_id] = indexstats.[object_id]INNER JOIN sys.schemas dbschemas on dbtables.[schema_id] = dbschemas.[schema_id]INNER JOIN sys.indexes AS dbindexes ON dbindexes.[object_id] = indexstats.[object_id]AND indexstats.index_id = dbindexes.index_idWHERE indexstats.database_id = DB_ID()ORDER BY indexstats.avg_fragmentation_in_percent desc

A high index fragmentation rate can negatively affect query speed. If the index fragmentation rate is between 5% and 30%, it is recommended to restructure the index. If the index fragmentation rate is greater than 30%, the index should be rebuilt.

Refer to the following statement for checking statistical information:

SELECT t.name TableName, s.[name] StatName, STATS_DATE(t.object_id,s.[stats_id]) LastUpdated  FROM sys.[stats] AS s JOIN sys.[tables] AS t ON [s].[object_id] = [t].[object_id] WHERE t.type = 'u'

If the statistical information in the RDS is older than that in the self-built database, it can then be updated manually, thereby preventing the SQL Server from generating incorrect and inefficient implementation plans based on obsolete statistical information.

5. High Availability Architecture

As a public relational database service, RDS must first be stable, highly available, secure, and capable of delivering secure and stable services for users. The second point is high performance.

RDS SQL Server uses High Safety synchronization to guarantee consistency between the master and slave data. This mode sacrifices some performance compared to High Performance mode, but availability is highly improved and data is thoroughly protected.
Meanwhile, RDS also provides multi-available zone master and slave instances. Dual nodes are located in different machine rooms, thereby further guaranteeing high availability and security.

Troubleshooting:

  1. The RDS instance has more memory than that of the local server.
  2. Network latency is low.
  3. The MAXDOP value of the RDS is 2, and the MAXDOP value for the self-built ECS instance is default, i.e., the parallel statement can be applied to manage as many parallel threads as possible.
    It can be seen from the implementation plan that the DOP for a query statement in RDS is 2, while the DOP for a query statement in the self-built database is 8, meaning that execution on RDS is slower than on the self-built database.

01

02

Solutions:

1.As can be seen above, the RDS configuration is actually higher than that of the self-built database, therefore the MaxDOP value in the parameter settings of the RDS instance can be increased to improve the DOP.

2.A user table is found by the implementation plan to be missing an index. After adding the missing index to RDS, the query performance is significantly improved. Even though the DOP is 2, execution can be completed in 5s.

转载地址:http://qgcix.baihongyu.com/

你可能感兴趣的文章
P1026 统计单词个数
查看>>
[js高手之路] html5 canvas系列教程 - 状态详解(save与restore)
查看>>
poi excel 常用api
查看>>
AD提高动态的方法(附SNR计算)
查看>>
[转]轻松实现可伸缩性,容错性,和负载平衡的大规模多人在线系统
查看>>
五 数组
查看>>
也谈跨域数据交互解决方案
查看>>
EntityFramework中使用Include可能带来的问题
查看>>
面试题28:字符串的排列
查看>>
css important
查看>>
WPF 实现窗体拖动
查看>>
来自维基百科程序员Brandon Harris
查看>>
NULL不是数值
查看>>
CentOS 5 全功能WWW服务器搭建全教程
查看>>
scala111
查看>>
模块化服务规范——OSGI
查看>>
劣质代码评析——猜数字问题(上)
查看>>
纸上谈兵: 栈 (stack)
查看>>
Windows phone8 基础篇(三) 常用控件开发
查看>>
Oracle学习笔记之五,Oracle 11g的PL/SQL入门
查看>>