Oracle 12.2 ASM processes: how an Oracle ASM rebalance runs

zz/2024/5/23 5:23:49

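When will a disk group rebalance complete? There is no exact figure, but ASM does provide an estimate (GV$ASM_OPERATION.EST_MINUTES). Although a precise completion time is not available, you can look at the details of the rebalance operation to tell whether it is actually making progress, which phase it has reached, and whether that phase needs your attention.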

Understanding rebalance

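A rebalance operation has three phases: planning, extents relocation, and compacting. In terms of total rebalance time, the planning phase takes very little, so you normally do not need to watch it. The second phase, extents relocation, usually accounts for most of the rebalance time and is the one that deserves the most attention. Finally, we will also look at what the third phase, compacting, is doing.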

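First, it helps to understand why a rebalance is needed at all. If you added a disk to increase the usable space of the disk group, or you are adjusting disk space by resizing or dropping a disk, you may not care much about when the rebalance finishes. But if a disk in the disk group has failed, you have every reason to watch the progress. Suppose the disk group uses normal redundancy: if the partner disk of the failed disk also fails, the whole disk group is dismounted, every database running on that disk group crashes, and you may lose data. In that situation you really need to know when the rebalance will complete. More precisely, you need to know when the second phase, extents relocation, completes, because once it does, the redundancy of the disk group has been fully restored (the third phase does not matter for redundancy, as explained later).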

Extents relocation

To take a closer look at the extents relocation phase, I dropped a disk from a disk group that uses the default rebalance power:

SQL> show parameter power

NAME                                 TYPE                   VALUE
------------------------------------ ---------------------- ------------------------------
asm_power_limit                      integer                1

14:47:35 SQL> select group_number,disk_number,name,state,path,header_status from v$asm_disk where group_number=5;

GROUP_NUMBER DISK_NUMBER NAME                 STATE                PATH                 HEADER_STATUS
------------ ----------- -------------------- -------------------- -------------------- --------------------
           5           0 TESTDG_0000          NORMAL               /dev/raw/raw7        MEMBER
           5           2 TESTDG_0002          NORMAL               /dev/raw/raw13       MEMBER
           5           1 TESTDG_0001          NORMAL               /dev/raw/raw12       MEMBER
           5           3 TESTDG_0003          NORMAL               /dev/raw/raw14       MEMBER

14:48:38 SQL> alter diskgroup testdg drop disk TESTDG_0000;

Diskgroup altered.

The EST_MINUTES column of the GV$ASM_OPERATION view gives the estimated time remaining, in minutes. Here the estimate is 9 minutes.

14:49:04 SQL> select inst_id, operation, state, power, sofar, est_work, est_rate, est_minutes from gv$asm_operation where group_number=5;

   INST_ID OPERATION            STATE                     POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- -------------------- -------------------- ---------- ---------- ---------- ---------- -----------
         1 REBAL                RUN                           1          4       4748        475           9

About a minute later, EST_MINUTES has dropped to 0:

14:50:22 SQL> select inst_id, operation, state, power, sofar, est_work, est_rate, est_minutes from gv$asm_operation where group_number=5;

   INST_ID OPERATION            STATE                     POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- -------------------- -------------------- ---------- ---------- ---------- ---------- -----------
         1 REBAL                RUN                           1       3030       4748       2429           0
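The EST_MINUTES values in the two samples above are consistent with a simple relation: the remaining work (EST_WORK minus SOFAR) divided by EST_RATE, truncated to whole minutes. The sketch below assumes that relation. It is inferred from the samples in this article rather than from a documented formula, and the function name is hypothetical.

```python
def estimate_minutes(sofar: int, est_work: int, est_rate: int) -> int:
    """Approximate EST_MINUTES as remaining allocation units / rate.

    Assumption: EST_RATE is in allocation units per minute and the view
    truncates the result to whole minutes. This matches the samples in
    this article but is not an official formula.
    """
    if est_rate <= 0:
        raise ValueError("est_rate must be positive")
    remaining = max(est_work - sofar, 0)
    return remaining // est_rate


# Values copied from the two GV$ASM_OPERATION samples above.
print(estimate_minutes(4, 4748, 475))      # prints 9
print(estimate_minutes(3030, 4748, 2429))  # prints 0
```

Because EST_RATE itself is re-estimated as the rebalance runs (475 in the first sample, 2429 in the second), the result can move sharply between two queries.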

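EST_MINUTES on its own may not tell you much, but we can also see that SOFAR (the number of allocation units moved so far) keeps increasing, which is a good indicator to watch. The ASM alert log also records the drop disk operation and the OS process ID of ARB0, the process ASM uses to do all the rebalance work. More importantly, no errors are reported at any point in the process: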

SQL> alter diskgroup testdg drop disk TESTDG_0000

NOTE: GroupBlock outside rolling migration privileged region

NOTE: requesting all-instance membership refresh for group=5

Tue Jan 10 14:49:01 2017

GMON updating for reconfiguration, group 5 at 222 for pid 42, osid 6197

NOTE: group 5 PST updated.

Tue Jan 10 14:49:01 2017

NOTE: membership refresh pending for group 5/0x97f863e8 (TESTDG)

GMON querying group 5 at 223 for pid 18, osid 5012

SUCCESS: refreshed membership for 5/0x97f863e8 (TESTDG)

NOTE: starting rebalance of group 5/0x97f863e8 (TESTDG) at power 1

Starting background process ARB0

SUCCESS: alter diskgroup testdg drop disk TESTDG_0000

Tue Jan 10 14:49:04 2017

ARB0 started with pid=39, OS id=25416

NOTE: assigning ARB0 to group 5/0x97f863e8 (TESTDG) with 1 parallel I/O

cellip.ora not found.

NOTE: F1X0 copy 1 relocating from 0:2 to 2:2 for diskgroup 5 (TESTDG)

NOTE: F1X0 copy 3 relocating from 2:2 to 3:2599 for diskgroup 5 (TESTDG)

Tue Jan 10 14:49:13 2017

NOTE: Attempting voting file refresh on diskgroup TESTDG

NOTE: Refresh completed on diskgroup TESTDG. No voting file found.

Tue Jan 10 14:51:05 2017

NOTE: stopping process ARB0

SUCCESS: rebalance completed for group 5/0x97f863e8 (TESTDG)

Tue Jan 10 14:51:07 2017

NOTE: GroupBlock outside rolling migration privileged region

NOTE: requesting all-instance membership refresh for group=5

Tue Jan 10 14:51:10 2017

GMON updating for reconfiguration, group 5 at 224 for pid 39, osid 25633

NOTE: group 5 PST updated.

SUCCESS: grp 5 disk TESTDG_0000 emptied

NOTE: erasing header on grp 5 disk TESTDG_0000

NOTE: process _x000_+asm1 (25633) initiating offline of disk 0.3915944675 (TESTDG_0000) with mask 0x7e in group 5

NOTE: initiating PST update: grp = 5, dsk = 0/0xe96892e3, mask = 0x6a, op = clear

GMON updating disk modes for group 5 at 225 for pid 39, osid 25633

NOTE: group TESTDG: updated PST location: disk 0001 (PST copy 0)

NOTE: group TESTDG: updated PST location: disk 0002 (PST copy 1)

NOTE: group TESTDG: updated PST location: disk 0003 (PST copy 2)

NOTE: PST update grp = 5 completed successfully

NOTE: initiating PST update: grp = 5, dsk = 0/0xe96892e3, mask = 0x7e, op = clear

GMON updating disk modes for group 5 at 226 for pid 39, osid 25633

NOTE: cache closing disk 0 of grp 5: TESTDG_0000

NOTE: PST update grp = 5 completed successfully

GMON updating for reconfiguration, group 5 at 227 for pid 39, osid 25633

NOTE: cache closing disk 0 of grp 5: (not open) TESTDG_0000

NOTE: group 5 PST updated.

NOTE: membership refresh pending for group 5/0x97f863e8 (TESTDG)

GMON querying group 5 at 228 for pid 18, osid 5012

GMON querying group 5 at 229 for pid 18, osid 5012

NOTE: Disk TESTDG_0000 in mode 0x0 marked for de-assignment

SUCCESS: refreshed membership for 5/0x97f863e8 (TESTDG)

Tue Jan 10 14:51:16 2017

NOTE: Attempting voting file refresh on diskgroup TESTDG

NOTE: Refresh completed on diskgroup TESTDG. No voting file found.

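So ASM estimated 9 minutes for the rebalance, but it actually took about 2 minutes. That is why it matters to first know what the rebalance is doing, and only then ask when it will finish. Note that the estimate is dynamic: it can go up or down depending on changes in system load and on the rebalance power setting. For a very large disk group, a rebalance can take hours or even days.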
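The actual duration can be read off the alert log by pairing the "starting rebalance" and "rebalance completed" messages with the timestamp lines above them. The following is a minimal sketch, assuming the alert log format shown in this article and an English (C) locale for the day and month names; the function names are hypothetical.

```python
from datetime import datetime
from typing import Optional

TS_FORMAT = "%a %b %d %H:%M:%S %Y"  # e.g. "Tue Jan 10 14:49:01 2017"


def parse_timestamp(line: str) -> Optional[datetime]:
    """Return a datetime if the line is an alert log timestamp, else None."""
    try:
        return datetime.strptime(line.strip(), TS_FORMAT)
    except ValueError:
        return None


def rebalance_duration_seconds(log_text: str) -> Optional[float]:
    """Seconds between 'starting rebalance' and 'rebalance completed'.

    Each message is dated with the most recent timestamp line above it.
    If the log holds several rebalances, the last of each message wins.
    Returns None if either message is missing.
    """
    current = started = finished = None
    for line in log_text.splitlines():
        ts = parse_timestamp(line)
        if ts is not None:
            current = ts
        elif "NOTE: starting rebalance of group" in line:
            started = current
        elif "SUCCESS: rebalance completed for group" in line:
            finished = current
    if started is None or finished is None:
        return None
    return (finished - started).total_seconds()


# Lines copied from the alert log excerpt above.
sample = """\
Tue Jan 10 14:49:01 2017
NOTE: starting rebalance of group 5/0x97f863e8 (TESTDG) at power 1
Tue Jan 10 14:51:05 2017
NOTE: stopping process ARB0
SUCCESS: rebalance completed for group 5/0x97f863e8 (TESTDG)
"""
print(rebalance_duration_seconds(sample))  # prints 124.0
```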

The ARB0 trace file also shows which ASM file's extents are currently being relocated. This trace file is how we can confirm that ARB0 really is doing its job.

[grid@jyrac1 trace]$ tail -f +ASM1_arb0_25416.trc

*** 2017-01-10 14:49:20.160

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:24.081

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:28.290

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:32.108

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:35.419

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:38.921

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:43.613

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:47.523

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:51.073

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:54.545

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:49:58.538

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:02.944

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:06.428

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:10.035

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:13.507

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:17.526

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:21.692

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:25.649

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:29.360

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:33.233

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:37.287

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:40.843

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:44.356

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:48.158

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:51.854

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:55.568

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:50:59.439

ARB0 relocating file +TESTDG.256.932913341 (120 entries)

*** 2017-01-10 14:51:02.877

ARB0 relocating file +TESTDG.256.932913341 (50 entries)
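Reading the trace by eye gets tedious on a long rebalance. As a rough sketch, assuming trace lines have exactly the "ARB0 relocating file ... (N entries)" shape shown above, the entries can be totalled per ASM file; the regular expression and function name are illustrative, not part of any Oracle tooling.

```python
import re
from collections import Counter

# Matches lines such as:
#   ARB0 relocating file +TESTDG.256.932913341 (120 entries)
RELOC_RE = re.compile(r"ARB\d+ relocating file (\S+) \((\d+) entries\)")


def summarize_relocations(trace_text: str) -> Counter:
    """Total the relocated entries per ASM file named in an ARB trace."""
    totals: Counter = Counter()
    for line in trace_text.splitlines():
        match = RELOC_RE.search(line)
        if match:
            totals[match.group(1)] += int(match.group(2))
    return totals


# The last two records of the trace excerpt above.
sample = """\
*** 2017-01-10 14:50:59.439
ARB0 relocating file +TESTDG.256.932913341 (120 entries)
*** 2017-01-10 14:51:02.877
ARB0 relocating file +TESTDG.256.932913341 (50 entries)
"""
for name, entries in summarize_relocations(sample).items():
    print(name, entries)  # prints +TESTDG.256.932913341 170
```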

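Note that the trace directory may contain many arb0 trace files, so we need the OS process ID of the ARB0 that is actually doing the rebalance. This is shown in the alert log when the ASM instance starts the rebalance. We can also use the operating system command pstack to trace the ARB0 process and see exactly what it is doing. As shown below, ASM is relocating extents (the key functions in the stack are kfgbRebalExecute - kfdaExecute - kffRelocate):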

[root@jyrac1 ~]# pstack 25416

#0 0x0000003aa88005f4 in ?? () from /usr/lib64/libaio.so.1

#1 0x0000000002bb9b11 in skgfrliopo ()

#2 0x0000000002bb9909 in skgfospo ()

#3 0x00000000086c595f in skgfrwat ()

#4 0x00000000085a4f79 in ksfdwtio ()

#5 0x000000000220b2a3 in ksfdwat_internal ()

#6 0x0000000003ee7f33 in kfk_reap_ufs_async_io ()

#7 0x0000000003ee7e7b in kfk_reap_ios_from_subsys ()

#8 0x0000000000aea0ac in kfk_reap_ios ()

#9 0x0000000003ee749e in kfk_io1 ()

#10 0x0000000003ee7044 in kfkRequest ()

#11 0x0000000003eed84a in kfk_transitIO ()

#12 0x0000000003e40e7a in kffRelocateWait ()

#13 0x0000000003e67d12 in kffRelocate ()

#14 0x0000000003ddd3fb in kfdaExecute ()

#15 0x0000000003ec075b in kfgbRebalExecute ()

#16 0x0000000003ead530 in kfgbDriver ()

#17 0x00000000021b37df in ksbabs ()

#18 0x0000000003ec4768 in kfgbRun ()

#19 0x00000000021b8553 in ksbrdp ()

#20 0x00000000023deff7 in opirip ()

#21 0x00000000016898bd in opidrv ()

#22 0x0000000001c6357f in sou2o ()

#23 0x00000000008523ca in opimai_real ()

#24 0x0000000001c6989d in ssthrdmain ()

#25 0x00000000008522c1 in main ()

Compacting

In the next example we look at the compacting phase of a rebalance. I add the dropped disk back and set the rebalance power to 2:

17:26:48 SQL> alter diskgroup testdg add disk '/dev/raw/raw7' rebalance power 2;

Diskgroup altered.

ASM estimates 6 minutes for the rebalance:

16:07:13 SQL> select INST_ID, OPERATION, STATE, POWER, SOFAR, EST_WORK, EST_RATE, EST_MINUTES from GV$ASM_OPERATION where GROUP_NUMBER=1;

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         1 REBAL RUN          10        489      53851       7920           6

About 10 seconds later, EST_MINUTES has dropped to 0.

16:07:23 SQL> /

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         1 REBAL RUN          10      92407      97874       8716           0

At this point the ASM alert log shows:

SQL> alter diskgroup testdg add disk '/dev/raw/raw7' rebalance power 2

NOTE: GroupBlock outside rolling migration privileged region

NOTE: Assigning number (5,0) to disk (/dev/raw/raw7)

NOTE: requesting all-instance membership refresh for group=5

NOTE: initializing header on grp 5 disk TESTDG_0000

NOTE: requesting all-instance disk validation for group=5

Tue Jan 10 16:07:12 2017

NOTE: skipping rediscovery for group 5/0x97f863e8 (TESTDG) on local instance.

NOTE: requesting all-instance disk validation for group=5

NOTE: skipping rediscovery for group 5/0x97f863e8 (TESTDG) on local instance.

Tue Jan 10 16:07:12 2017

GMON updating for reconfiguration, group 5 at 230 for pid 42, osid 6197

NOTE: group 5 PST updated.

NOTE: initiating PST update: grp = 5

GMON updating group 5 at 231 for pid 42, osid 6197

NOTE: PST update grp = 5 completed successfully

NOTE: membership refresh pending for group 5/0x97f863e8 (TESTDG)

GMON querying group 5 at 232 for pid 18, osid 5012

NOTE: cache opening disk 0 of grp 5: TESTDG_0000 path:/dev/raw/raw7

GMON querying group 5 at 233 for pid 18, osid 5012

SUCCESS: refreshed membership for 5/0x97f863e8 (TESTDG)

NOTE: starting rebalance of group 5/0x97f863e8 (TESTDG) at power 1

SUCCESS: alter diskgroup testdg add disk '/dev/raw/raw7'

Starting background process ARB0

Tue Jan 10 16:07:14 2017

ARB0 started with pid=27, OS id=982

NOTE: assigning ARB0 to group 5/0x97f863e8 (TESTDG) with 1 parallel I/O

cellip.ora not found.

Tue Jan 10 16:07:23 2017

NOTE: Attempting voting file refresh on diskgroup TESTDG

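This output means ASM has finished the second phase of the rebalance and has started the third phase, compacting. If that is right, pstack should show the kfdCompact() function in the stack. The output below confirms it: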

# pstack 982

#0 0x0000003957ccb6ef in poll () from /lib64/libc.so.6

...

#9 0x0000000003d711e0 in kfk_reap_oss_async_io ()

#10 0x0000000003d70c17 in kfk_reap_ios_from_subsys ()

#11 0x0000000000aea50e in kfk_reap_ios ()

#12 0x0000000003d702ae in kfk_io1 ()

#13 0x0000000003d6fe54 in kfkRequest ()

#14 0x0000000003d76540 in kfk_transitIO ()

#15 0x0000000003cd482b in kffRelocateWait ()

#16 0x0000000003cfa190 in kffRelocate ()

#17 0x0000000003c7ba16 in kfdaExecute ()

#18 0x0000000003c4b737 in kfdCompact ()

#19 0x0000000003c4c6d0 in kfdExecute ()

#20 0x0000000003d4bf0e in kfgbRebalExecute ()

#21 0x0000000003d39627 in kfgbDriver ()

#22 0x00000000020e8d23 in ksbabs ()

#23 0x0000000003d4faae in kfgbRun ()

#24 0x00000000020ed95d in ksbrdp ()

#25 0x0000000002322343 in opirip ()

#26 0x0000000001618571 in opidrv ()

#27 0x0000000001c13be7 in sou2o ()

#28 0x000000000083ceba in opimai_real ()

#29 0x0000000001c19b58 in ssthrdmain ()

#30 0x000000000083cda1 in main ()
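The two pstack dumps in this article differ in one telling frame: kfdCompact appears only in the second one, while kffRelocate appears in both. A small sketch of that check follows, assuming pstack frames have the "#N 0xADDR in function ()" shape shown above. The function name and the returned labels are hypothetical, and the rule only encodes what these two dumps show.

```python
import re

# Captures the function name in pstack frames such as:
#   #18 0x0000000003c4b737 in kfdCompact ()
FRAME_RE = re.compile(r"^#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+\(", re.MULTILINE)


def rebalance_phase(pstack_output: str) -> str:
    """Guess the rebalance phase from the functions on the ARB0 stack."""
    functions = set(FRAME_RE.findall(pstack_output))
    # kfdCompact must be tested first: the compacting stack also
    # contains kffRelocate, so the order of these checks matters.
    if "kfdCompact" in functions:
        return "compacting"
    if "kffRelocate" in functions:
        return "extents relocation"
    return "unknown"


# Frames copied from the two pstack dumps in this article.
relocation_stack = """\
#13 0x0000000003e67d12 in kffRelocate ()
#14 0x0000000003ddd3fb in kfdaExecute ()
#15 0x0000000003ec075b in kfgbRebalExecute ()
"""
compacting_stack = """\
#16 0x0000000003cfa190 in kffRelocate ()
#17 0x0000000003c7ba16 in kfdaExecute ()
#18 0x0000000003c4b737 in kfdCompact ()
#19 0x0000000003c4c6d0 in kfdExecute ()
#20 0x0000000003d4bf0e in kfgbRebalExecute ()
"""
print(rebalance_phase(relocation_stack))  # prints extents relocation
print(rebalance_phase(compacting_stack))  # prints compacting
```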

Tailing the ARB0 trace file shows that relocation is still going on, but only one entry at a time (another important clue that the rebalance has reached the compacting phase):

$ tail -f +ASM1_arb0_25416.trc

ARB0 relocating file +DATA1.321.788357323 (1 entries)

ARB0 relocating file +DATA1.321.788357323 (1 entries)

ARB0 relocating file +DATA1.321.788357323 (1 entries)

...

During compacting, the EST_MINUTES column of V$ASM_OPERATION shows 0 (also an important clue):

16:08:56 SQL> /

   INST_ID OPERA STAT      POWER      SOFAR   EST_WORK   EST_RATE EST_MINUTES
---------- ----- ---- ---------- ---------- ---------- ---------- -----------
         2 REBAL RUN          10      98271      98305       7919           0

The REBALST_KFGMG column of the fixed table X$KFGMG shows 2, which means compacting is in progress.

16:09:12 SQL> select NUMBER_KFGMG, OP_KFGMG, ACTUAL_KFGMG, REBALST_KFGMG from X$KFGMG;

NUMBER_KFGMG   OP_KFGMG ACTUAL_KFGMG REBALST_KFGMG
------------ ---------- ------------ -------------
           1          1           10             2

Once the compacting phase completes, the ASM alert log shows stopping process ARB0 and rebalance completed:

Tue Jan 10 16:10:19 2017

NOTE: stopping process ARB0

SUCCESS: rebalance completed for group 5/0x97f863e8 (TESTDG)

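Once extents relocation has completed, all data already meets the redundancy requirement, so you no longer need to worry that a failure of the failed disk's partner disk will cause a serious outage.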

Changing the power

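The rebalance power can be changed dynamically while a disk group rebalance is running. If you think the default power is too low, it is easy to raise it. But raise it to what? That depends on the I/O load and I/O throughput of your system. In general, start with a conservative value such as 5, wait ten minutes, and check whether the rebalance has sped up and whether other workloads' I/O has been affected. If your I/O subsystem is very strong, keep increasing the power, but in my experience it is rare to see much further improvement once the power is set above 30. The key point is to test under the normal load of your production system: different workloads and different storage systems can produce very different rebalance times.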
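The "raise it, wait, compare" procedure can be sketched as two small helpers: one builds the ALTER DISKGROUP ... REBALANCE POWER statement, the other compares EST_RATE before and after the change. Both function names, the 10% gain threshold, and the rate figures in the usage lines are assumptions made for illustration, not values from this article. The name check is deliberately conservative and may reject some valid disk group names.

```python
def rebalance_power_statement(diskgroup: str, power: int) -> str:
    """Build the ALTER DISKGROUP statement that changes the rebalance power."""
    # Conservative sanity check on the name (an assumption, not Oracle's rule).
    if not diskgroup.isidentifier():
        raise ValueError("unexpected disk group name: %r" % diskgroup)
    if power < 0:
        raise ValueError("power must not be negative")
    return f"ALTER DISKGROUP {diskgroup} REBALANCE POWER {power}"


def worth_raising(rate_before: int, rate_after: int, min_gain: float = 0.10) -> bool:
    """True if EST_RATE improved by at least min_gain after raising the power.

    min_gain is a hypothetical threshold; pick one that suits your system.
    """
    if rate_before <= 0:
        return rate_after > 0
    return (rate_after - rate_before) / rate_before >= min_gain


print(rebalance_power_statement("testdg", 5))
# prints ALTER DISKGROUP testdg REBALANCE POWER 5

# Hypothetical EST_RATE readings taken before and after a power change.
print(worth_raising(1000, 1500))  # prints True
print(worth_raising(1000, 1050))  # prints False
```

A rate comparison alone says nothing about the impact on other workloads, so it complements rather than replaces watching application I/O during the test.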


http://www.ngui.cc/zz/2389896.html
