ASM rebalance: the three phases

zz/2023/6/4 16:04:24

The disk group rebalance operation has three phases:

  • Planning -- usually completes within 30 seconds
  • File extents relocation -- observed timings: adding 3 x 2 TB disks (1 TB of data) took about 2 hours, dropping 1 TB took about 1.5 hours; adding 10 x 2 TB disks (8 TB of data) took about 8 hours, dropping 5 TB took about 8 hours
  • Disk compacting -- dropping 3 x 2 TB disks (1 TB of data) took about 40 minutes, adding disks (8 TB of data) took about 4 hours

GOAL

What is the compact phase during rebalance?

 

SOLUTION

+ The compact phase is part of the rebalance operation; it moves the data as close as possible to the outer tracks of the disks (the lower-numbered offsets).

The first time you run a rebalance under 11g, it can take a while if the disk group configuration was changed (especially via ADD DISK) while running 10g ASM. Subsequent manual rebalances without a configuration change should not take as much time.

A disk group where the compact phase of rebalance has done a lot of work will tend to have better performance than the pre-compact disk group. The data should be clustered near the higher-performing tracks of the disk, resulting in less seek time.

 

+ It's enabled by default from 11g onwards.

 

+ It generally takes place at the end of the rebalance operation.

Before 12c, the compact phase is not visible in the V$ASM_OPERATION view at the ASM level. If EST_MINUTES shows '0' and the operation appears to hang for a long time, the rebalance is probably in the compact phase. This can be confirmed with a system state dump at the ASM level, which typically shows no blocking sessions and waits on "kfk: async IO".
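
As a quick cross-check without taking a system state dump, the wait event of the ARB0 process can also be sampled from the ASM instance. This is a minimal sketch, assuming the background process name shows up in the PROGRAM column of GV$SESSION (the exact format varies by platform and version):

-- Sample what the rebalance (ARB0) process is currently waiting on
select inst_id, program, event, seconds_in_wait
  from gv$session
 where program like '%ARB%';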

From 12c onwards, the compact phase is shown as a separate operation.
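
In 12c the current phase is exposed through the PASS column of V$ASM_OPERATION (values include REBALANCE and COMPACT). A minimal sketch of such a check, assuming the 12c version of the view:

-- 12c and later: the PASS column shows which phase the rebalance is in
select group_number, operation, pass, state, power, sofar, est_work, est_minutes
  from v$asm_operation;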

 

+ Compact phase can be disabled.

Before 12c, use the hidden parameter _disable_rebalance_compact=true at the instance level.

From 12c onwards, the _disable_rebalance_compact parameter is no longer available; however, the disk group attribute _rebalance_compact can be used instead:

SQL> ALTER DISKGROUP <dg> SET ATTRIBUTE '_rebalance_compact'='FALSE';

 

When Will the Rebalance Complete (Doc ID 1477905.1)

Oracle Database - Enterprise Edition - Version 10.2.0.1 to 12.2.0.1 [Release 10.2 to 12.2]
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Information in this document applies to any platform.

PURPOSE

This document is about disk group rebalance in Oracle Automatic Storage Management (ASM). The intent is to provide some insight into the disk group rebalance process, how to check if the rebalance is progressing, and to assist in determining when the rebalance will complete.

DETAILS

Introduction

The disk group rebalance operation has three phases:

  • Planning
  • File extents relocation
  • Disk compacting

As far as the overall time to complete is concerned, the planning phase time is insignificant, so there is no need to worry about it. The file extents relocation phase will take most of the time, so the main focus will be on that. The disk compacting phase may also take a significant amount of time, in particular on a disk add, so we will have a closer look at that as well.

It is important to understand why the rebalance is running. If we are adding a new disk, say to increase the available disk group space, it doesn't really matter how long it will take for the rebalance to complete. Similarly if we are resizing or dropping disk(s), to adjust the disk group space, we are generally not concerned with the time it takes for the rebalance to complete.

But if a disk has failed and ASM has initiated rebalance, there may be a legitimate reason for concern. If the disk group is normal redundancy AND if another disk fails AND it is the partner of the disk that has already failed, the disk group will be dismounted, all the databases that use that disk group will crash and there may be loss of data. In such cases it may be important to have an idea when the rebalance operation will complete. Actually, we want to see the file extents relocation phase completed, as once it does, all the data is fully redundant again (in case the rebalance was initiated due to a disk failure).
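
If the rebalance was triggered by a disk failure, a quick way to see whether any disk is currently offline or missing (and hence whether redundancy is still being restored) is to query V$ASM_DISK; a minimal sketch using the standard columns:

-- Check for offline or missing disks; anything other than ONLINE/NORMAL needs attention
select group_number, disk_number, name, mount_status, mode_status, state
  from v$asm_disk
 order by group_number, disk_number;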

File extents relocation

To have a closer look at the file extents relocation phase, I drop one of the disks with the default rebalance power. I then query GV$ASM_OPERATION to check the estimated completion time (EST_MINUTES):

SQL> show parameter power

NAME                                 TYPE        VALUE
------------------------------------ ----------- ----------------
asm_power_limit                      integer     1

SQL> set time on
16:40:57 SQL> alter diskgroup DATA1 drop disk DATA1_CD_06_CELL06;

Diskgroup altered.

Initial estimated time to complete is 26 minutes:
16:41:21 SQL> select INST_ID, OPERATION, STATE, POWER, SOFAR, EST_WORK, EST_RATE, EST_MINUTES from GV$ASM_OPERATION where GROUP_NUMBER=1;

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         3 REBAL WAIT          1
         2 REBAL RUN           1        516      53736       2012          26
         4 REBAL WAIT          1

 
About 10 minutes into the rebalance, the estimate to complete is 24 minutes:

16:50:25 SQL> /

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         3 REBAL WAIT          1
         2 REBAL RUN           1      19235      72210       2124          24
         4 REBAL WAIT          1

 
While that EST_MINUTES doesn't give me much confidence, I see that SOFAR (number of allocation units moved so far) is going up, which is a good sign.
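
A rough cross-check of that estimate can be computed from the same raw counters, assuming EST_RATE (allocation units moved per minute) stays roughly constant; a minimal sketch:

-- Approximate minutes remaining, derived from the raw counters instead of EST_MINUTES
select inst_id, sofar, est_work, est_rate,
       round((est_work - sofar) / nullif(est_rate, 0)) as approx_minutes_left
  from gv$asm_operation
 where group_number = 1
   and state = 'RUN';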

The ASM alert log shows the time of the disk drop, the OS process ID of the ARB0 process doing all the work, and most importantly, that there are no errors:

Wed Jul 11 16:41:15 2012
SQL> alter diskgroup DATA1 drop disk DATA1_CD_06_CELL06
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
...
NOTE: starting rebalance of group 1/0x6ecaf3e6 (DATA1) at power 1
Starting background process ARB0
Wed Jul 11 16:41:24 2012
ARB0 started with pid=41, OS id=58591
NOTE: assigning ARB0 to group 1/0x6ecaf3e6 (DATA1) with 1 parallel I/O
NOTE: F1X0 copy 3 relocating from 0:2 to 55:35379 for diskgroup 1 (DATA1)
...


ARB0 trace file should show which file extents are being relocated. It does, and that is how I know that ARB0 is doing what it is supposed to do:

$ tail -f <ASM Trace Directory>/+ASM2_arb0_58591.trc
...
ARB0 relocating file +DATA1.282.788356359 (120 entries)
*** 2012-07-11 16:48:44.808
ARB0 relocating file +DATA1.283.788356383 (120 entries)
...
*** 2012-07-11 17:13:11.761
ARB0 relocating file +DATA1.316.788357201 (120 entries)
*** 2012-07-11 17:13:16.326
ARB0 relocating file +DATA1.316.788357201 (120 entries)
...


Note that there may be a lot of ARB0 trace files in the trace directory, which is why we need to know the OS process ID of the ARB0 process actually doing the rebalance. That information is in the alert log of the ASM instance performing the rebalance.
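
If the alert log is not at hand, the OS process ID of the ARB0 process can also be looked up from the ASM instance itself; a minimal sketch, assuming the PNAME column of V$PROCESS (available in recent releases):

-- Find the OS process id (SPID) of the ARB0 process on this ASM instance
select pname, spid
  from v$process
 where pname like 'ARB%';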

I can also look at the pstack of the ARB0 process to see what is going on. It does show me that ASM is relocating extents (key functions on the stack being kfgbRebalExecute - kfdaExecute - kffRelocate):

# pstack 58591
#0  0x0000003957ccb6ef in poll () from /lib64/libc.so.6
...
#9  0x0000000003d711e0 in kfk_reap_oss_async_io ()
#10 0x0000000003d70c17 in kfk_reap_ios_from_subsys ()
#11 0x0000000000aea50e in kfk_reap_ios ()
#12 0x0000000003d702ae in kfk_io1 ()
#13 0x0000000003d6fe54 in kfkRequest ()
#14 0x0000000003d76540 in kfk_transitIO ()
#15 0x0000000003cd482b in kffRelocateWait ()
#16 0x0000000003cfa190 in kffRelocate ()
#17 0x0000000003c7ba16 in kfdaExecute ()
#18 0x0000000003d4beaa in kfgbRebalExecute ()
#19 0x0000000003d39627 in kfgbDriver ()
#20 0x00000000020e8d23 in ksbabs ()
#21 0x0000000003d4faae in kfgbRun ()
#22 0x00000000020ed95d in ksbrdp ()
#23 0x0000000002322343 in opirip ()
#24 0x0000000001618571 in opidrv ()
#25 0x0000000001c13be7 in sou2o ()
#26 0x000000000083ceba in opimai_real ()
#27 0x0000000001c19b58 in ssthrdmain ()
#28 0x000000000083cda1 in main ()


After about 35 minutes the EST_MINUTES drops to 0:

17:16:54 SQL> /

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         2 REBAL RUN           1      74581      75825       2129           0
         3 REBAL WAIT          1
         4 REBAL WAIT          1

 

And soon after that, the ASM alert log shows:

  • Disk emptied
  • Disk header erased
  • PST update completed successfully
  • Disk closed
  • Rebalance completed

Wed Jul 11 17:17:32 2012
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Wed Jul 11 17:17:41 2012
GMON updating for reconfiguration, group 1 at 20 for pid 38, osid 93832
NOTE: group 1 PST updated.
SUCCESS: grp 1 disk DATA1_CD_06_CELL06 emptied
NOTE: erasing header on grp 1 disk DATA1_CD_06_CELL06
NOTE: process _x000_+asm2 (93832) initiating offline of disk 0.3916039210 (DATA1_CD_06_CELL06) with mask 0x7e in group 1
NOTE: initiating PST update: grp = 1, dsk = 0/0xe96a042a, mask = 0x6a, op = clear
GMON updating disk modes for group 1 at 21 for pid 38, osid 93832
NOTE: PST update grp = 1 completed successfully
NOTE: initiating PST update: grp = 1, dsk = 0/0xe96a042a, mask = 0x7e, op = clear
GMON updating disk modes for group 1 at 22 for pid 38, osid 93832
NOTE: cache closing disk 0 of grp 1: DATA1_CD_06_CELL06
NOTE: PST update grp = 1 completed successfully
GMON updating for reconfiguration, group 1 at 23 for pid 38, osid 93832
NOTE: cache closing disk 0 of grp 1: (not open) DATA1_CD_06_CELL06
NOTE: group 1 PST updated.
Wed Jul 11 17:17:41 2012
NOTE: membership refresh pending for group 1/0x6ecaf3e6 (DATA1)
GMON querying group 1 at 24 for pid 19, osid 38421
GMON querying group 1 at 25 for pid 19, osid 38421
NOTE: Disk  in mode 0x8 marked for de-assignment
SUCCESS: refreshed membership for 1/0x6ecaf3e6 (DATA1)
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0x6ecaf3e6 (DATA1)
NOTE: Attempting voting file refresh on diskgroup DATA1


The estimated time was 26 minutes and the rebalance actually took about 36 minutes (in this particular case the disk compacting took less than a minute, so I have ignored it). That is why it is more important to understand what is going on than to know exactly when the rebalance will complete.

Note that the estimated time may also keep increasing. If the system is under heavy load, the rebalance will take more time, especially with rebalance power 1. For a large disk group (many TB) with a large number of files, the rebalance can take hours and possibly days.

If you want to get an idea of how long a disk drop will take in your environment, you need to test it. Just drop one of the disks while your system is under normal/typical load. Your data is fully redundant during such a disk drop, so you are not exposed to a disk group dismount in case its partner disk fails during the rebalance.
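
A minimal sketch of such a test, reusing the disk name from the example above; note that a drop that has not yet finished can be cancelled with the standard UNDROP DISKS command, which is handy if the test has to be aborted:

-- Start the test: drop one disk under normal load, then monitor GV$ASM_OPERATION
alter diskgroup DATA1 drop disk DATA1_CD_06_CELL06;

-- If needed, cancel the pending drop before the rebalance completes
alter diskgroup DATA1 undrop disks;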

Disk compacting

To look at the disk compacting phase, I add the same disk back, with rebalance power 10:

17:26:48 SQL> alter diskgroup DATA1 add disk '/o/*/DATA1_CD_06_celll06' rebalance power 10;

Diskgroup altered.

Initial estimated time to complete is 6 minutes:
17:27:22 SQL> select INST_ID, OPERATION, STATE, POWER, SOFAR, EST_WORK, EST_RATE, EST_MINUTES from GV$ASM_OPERATION where GROUP_NUMBER=1;

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         2 REBAL RUN          10        489      53851       7920           6
         3 REBAL WAIT         10
         4 REBAL WAIT         10


After about 10 minutes, the EST_MINUTES drops to 0:

17:39:05 SQL> /

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         3 REBAL WAIT         10
         2 REBAL RUN          10      92407      97874       8716           0
         4 REBAL WAIT         10


And I see the following in the ASM alert log:

Wed Jul 11 17:39:49 2012
NOTE: GroupBlock outside rolling migration privileged region
NOTE: requesting all-instance membership refresh for group=1
Wed Jul 11 17:39:58 2012
GMON updating for reconfiguration, group 1 at 31 for pid 43, osid 115117
NOTE: group 1 PST updated.
Wed Jul 11 17:39:58 2012
NOTE: membership refresh pending for group 1/0x6ecaf3e6 (DATA1)
GMON querying group 1 at 32 for pid 19, osid 38421
SUCCESS: refreshed membership for 1/0x6ecaf3e6 (DATA1)
NOTE: Attempting voting file refresh on diskgroup DATA1


That means that ASM has completed the file extents relocation phase of the rebalance and has started the disk compacting phase. If that is true, we should see the kfdCompact() function on the stack. And we do:

# pstack 103326
#0  0x0000003957ccb6ef in poll () from /lib64/libc.so.6
...
#9  0x0000000003d711e0 in kfk_reap_oss_async_io ()
#10 0x0000000003d70c17 in kfk_reap_ios_from_subsys ()
#11 0x0000000000aea50e in kfk_reap_ios ()
#12 0x0000000003d702ae in kfk_io1 ()
#13 0x0000000003d6fe54 in kfkRequest ()
#14 0x0000000003d76540 in kfk_transitIO ()
#15 0x0000000003cd482b in kffRelocateWait ()
#16 0x0000000003cfa190 in kffRelocate ()
#17 0x0000000003c7ba16 in kfdaExecute ()
#18 0x0000000003c4b737 in kfdCompact ()
#19 0x0000000003c4c6d0 in kfdExecute ()
#20 0x0000000003d4bf0e in kfgbRebalExecute ()
#21 0x0000000003d39627 in kfgbDriver ()
#22 0x00000000020e8d23 in ksbabs ()
#23 0x0000000003d4faae in kfgbRun ()
#24 0x00000000020ed95d in ksbrdp ()
#25 0x0000000002322343 in opirip ()
#26 0x0000000001618571 in opidrv ()
#27 0x0000000001c13be7 in sou2o ()
#28 0x000000000083ceba in opimai_real ()
#29 0x0000000001c19b58 in ssthrdmain ()
#30 0x000000000083cda1 in main ()


The tail on the current ARB0 trace file now shows that it is relocating just 1 allocation unit (1 entries) at a time (another sign of the disk compacting phase):

$ tail -f <ASM Trace Directory>/+ASM2_arb0_103326.trc
ARB0 relocating file +DATA1.321.788357323 (1 entries)
ARB0 relocating file +DATA1.321.788357323 (1 entries)
ARB0 relocating file +DATA1.321.788357323 (1 entries)
...


The V$ASM_OPERATION view keeps showing EST_MINUTES=0 for the whole duration of the disk compacting (while not helpful, this is normal and expected):

17:42:39 SQL> /

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         3 REBAL WAIT         10
         4 REBAL WAIT         10
         2 REBAL RUN          10      98271      98305       7919           0

 
The X$KFGMG view shows REBALST_KFGMG=2 (yet another confirmation of the disk compacting phase):

17:42:50 SQL> select NUMBER_KFGMG, OP_KFGMG, ACTUAL_KFGMG, REBALST_KFGMG from X$KFGMG;

NUMBER_KFGMG   OP_KFGMG ACTUAL_KFGMG REBALST_KFGMG
------------ ---------- ------------ -------------
           1          1           10             2

 
Once the compacting phase completes, the alert log shows "stopping process ARB0" and "rebalance completed":

Wed Jul 11 17:43:48 2012
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 1/0x6ecaf3e6 (DATA1)


In this case, the file extents relocation phase took about 12 minutes and the disk compacting phase took about 4 minutes.

The compacting phase can actually take a significant amount of time. In one case I have seen the file extents relocation run for 60 minutes, and the disk compacting after that took another 30 minutes. But it doesn't really matter how long the compacting takes to complete, because as soon as the file extents relocation completes, all data is fully redundant and we are not exposed to a disk group dismount due to a partner disk failure.

Adjusting the rebalance power

The rebalance power can be adjusted dynamically, i.e. during the rebalance. If the rebalance with the default power is 'too slow', the power can be increased. How much? To answer that question, we need to understand the I/O load, the I/O throughput and, most importantly, the I/O limits the system can take. If we don't know that, the power can be increased to 5 (with 'ALTER DISKGROUP ... REBALANCE POWER 5;'). We can then check if that makes a difference. Should we go any higher with the rebalance power? Again, as long as we are not adversely impacting the database I/O performance, we can keep increasing the power, but I haven't seen much improvement beyond power 30. Note that the power can go up to 11 for disk groups with COMPATIBLE.ASM < 11.2.0.2 and up to 1024 for disk groups with COMPATIBLE.ASM >= 11.2.0.2.
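
As a concrete illustration of adjusting the power on the fly (the command form is the one quoted above; DATA1 is just the disk group used in the earlier examples):

-- Raise the power of the rebalance that is already running on DATA1
alter diskgroup DATA1 rebalance power 5;

-- Check whether the higher power changes the estimated completion time
select inst_id, operation, state, power, sofar, est_work, est_rate, est_minutes
  from gv$asm_operation
 where group_number = 1;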

Testing is the key here. We really need to test the rebalance under regular production load, with different values for the power. There is no point in testing with no databases running, or on a system with different storage characteristics.

References

Oracle® Automatic Storage Management Administrator's Guide 11g Release 2 (11.2)
Chapter 1 Introduction to Oracle Automatic Storage Management
About Online Storage Reconfigurations and Dynamic Rebalancing

Oracle® Automatic Storage Management Administrator's Guide 11g Release 2 (11.2)
Chapter 4 Administering Oracle ASM Disk Groups
Manually Rebalancing Disk Groups
Tuning Rebalance Operations

Oracle® Database Reference 11g Release 2 (11.2)
V$ASM_OPERATION

Oracle Sun Database Machine X2-2/X2-8 High Availability Best Practices (Doc ID 1274322.1)
Section Check ASM rebalance forward progress if you suspect a problem
Shell script rebalance_file_progress.sh

REFERENCES

NOTE:1274322.1 - Oracle Exadata High Availability Best Practices
BUG:21158299 - UNBALANCED FAILGROUPS AND MIRRORING IS BROKEN

Can ASM “_DISABLE_REBALANCE_COMPACT=TRUE" Be Used With NetApp SAN Environment? (Doc ID 1573768.1)

APPLIES TO:

Oracle Database - Enterprise Edition - Version 11.2.0.3 and later
Oracle Database Cloud Schema Service - Version N/A and later
Oracle Database Exadata Cloud Machine - Version N/A and later
Oracle Cloud Infrastructure - Database Service - Version N/A and later
Oracle Database Exadata Express Cloud Service - Version N/A and later
Information in this document applies to any platform.

SYMPTOMS

A 28 TB ASM disk group had 4 TB added. At a rebalance power of 10 (ASM compatibility is 11.2.0.0), 3 hours were spent in the second phase (extents relocation) and an additional 3 hours in the third phase (compacting). Each LUN is 2 TB in size and the new LUNs were added together in one command. No databases were served by the particular ASM instance used to add the LUNs. The additional time spent in the compacting phase is not shown in the estimated rebalance time and impacted the batch process, so the customer is considering setting "_DISABLE_REBALANCE_COMPACT=TRUE" for the ASM instances connected to the NetApp FAS6080 SAN. Is there any reason not to put this setting in place?
 

CAUSE

No databases were served by the particular ASM instance used to add the LUNs.
 

SOLUTION

We advise setting _DISABLE_REBALANCE_COMPACT=TRUE in such an environment. Setting the initialization parameter _DISABLE_REBALANCE_COMPACT=TRUE will disable the compacting phase of the disk group rebalance for all disk groups.

-bash-4.1$ sqlplus / as sysasm

SQL*Plus: Release 11.2.0.4.0 Production on Wed Dec 30 22:50:40 2020

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

SQL> alter system set "_disable_rebalance_compact"=true scope=both sid='*';

System altered.

SQL> show parameter disable;

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
_disable_rebalance_compact           boolean     TRUE

Related references:

https://blog.csdn.net/fanzhuozhuo/article/details/106806645

https://cloud.tencent.com/developer/article/1052512

http://www.ngui.cc/zz/2389894.html
