
Fixing an Oracle 12c Cluster Startup Failure Caused by Resizing /dev/shm

Maintenance staff changed the size of /dev/shm on Oracle Linux 7, leaving it smaller than the Oracle instance's MEMORY_TARGET or SGA_TARGET. As a result the cluster could not start (CRS-4535, CRS-4000).
[grid@jtp1 ~]$ crsctl stat res -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.

Check whether the ASM disk permissions are the problem. The permissions turn out to be normal.
[root@jtp3 ~]# ls -lrt /dev/asm*
brw-rw----. 1 grid oinstall 8, 128 Apr  3  2018 /dev/asmdisk07
brw-rw----. 1 grid oinstall 8,  48 Apr  3  2018 /dev/asmdisk02
brw-rw----. 1 grid oinstall 8,  96 Apr  3  2018 /dev/asmdisk05
brw-rw----. 1 grid oinstall 8, 112 Apr  3  2018 /dev/asmdisk06
brw-rw----. 1 grid oinstall 8,  64 Apr  3  2018 /dev/asmdisk03
brw-rw----. 1 grid oinstall 8,  80 Apr  3  2018 /dev/asmdisk04
brw-rw----. 1 grid oinstall 8,  32 Apr  3  2018 /dev/asmdisk01

Restart CRS
[root@jtp1 bin]# ./crsctl stop crs -f
CRS-2791: Starting shutdown of Oracle High Availability Services-managed resources on 'jtp1'
CRS-2673: Attempting to stop 'ora.mdnsd' on 'jtp1'
CRS-2673: Attempting to stop 'ora.gpnpd' on 'jtp1'
CRS-2677: Stop of 'ora.mdnsd' on 'jtp1' succeeded
CRS-2677: Stop of 'ora.gpnpd' on 'jtp1' succeeded
CRS-2673: Attempting to stop 'ora.cluster_interconnect.haip' on 'jtp1'
CRS-2673: Attempting to stop 'ora.drivers.acfs' on 'jtp1'
CRS-2677: Stop of 'ora.drivers.acfs' on 'jtp1' succeeded
CRS-2677: Stop of 'ora.cluster_interconnect.haip' on 'jtp1' succeeded
CRS-2673: Attempting to stop 'ora.ctssd' on 'jtp1'
CRS-2673: Attempting to stop 'ora.evmd' on 'jtp1'
CRS-2677: Stop of 'ora.ctssd' on 'jtp1' succeeded
CRS-2677: Stop of 'ora.evmd' on 'jtp1' succeeded
CRS-2673: Attempting to stop 'ora.cssd' on 'jtp1'
CRS-2677: Stop of 'ora.cssd' on 'jtp1' succeeded
CRS-2673: Attempting to stop 'ora.gipcd' on 'jtp1'
CRS-2673: Attempting to stop 'ora.driver.afd' on 'jtp1'
CRS-2677: Stop of 'ora.driver.afd' on 'jtp1' succeeded
CRS-2677: Stop of 'ora.gipcd' on 'jtp1' succeeded
CRS-2793: Shutdown of Oracle High Availability Services-managed resources on 'jtp1' has completed
CRS-4133: Oracle High Availability Services has been stopped.
[root@jtp1 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.

Check the CRS alert.log. It shows that the disk group cannot be mounted.
[root@jtp1 ~]# tail -f /u01/app/grid/diag/crs/jtp1/crs/trace/alert.log
2018-04-02 18:30:21.227 [OHASD(8143)]CRS-8500: Oracle Clusterware OHASD process is starting with operating system process ID 8143
2018-04-02 18:30:21.230 [OHASD(8143)]CRS-0714: Oracle Clusterware Release 12.2.0.1.0.
2018-04-02 18:30:21.245 [OHASD(8143)]CRS-2112: The OLR service started on node jtp1.
2018-04-02 18:30:21.262 [OHASD(8143)]CRS-8017: location: /etc/oracle/lastgasp has 2 reboot advisory log files, 0 were announced and 0 errors occurred
2018-04-02 18:30:21.262 [OHASD(8143)]CRS-1301: Oracle High Availability Service started on node jtp1.
2018-04-02 18:30:21.567 [ORAROOTAGENT(8214)]CRS-8500: Oracle Clusterware ORAROOTAGENT process is starting with operating system process ID 8214
2018-04-02 18:30:21.600 [CSSDAGENT(8231)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 8231
2018-04-02 18:30:21.607 [CSSDMONITOR(8241)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 8241
2018-04-02 18:30:21.620 [ORAAGENT(8225)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 8225
2018-04-02 18:30:22.146 [ORAAGENT(8316)]CRS-8500: Oracle Clusterware ORAAGENT process is starting with operating system process ID 8316
2018-04-02 18:30:22.211 [MDNSD(8335)]CRS-8500: Oracle Clusterware MDNSD process is starting with operating system process ID 8335
2018-04-02 18:30:22.215 [EVMD(8337)]CRS-8500: Oracle Clusterware EVMD process is starting with operating system process ID 8337
2018-04-02 18:30:23.259 [GPNPD(8369)]CRS-8500: Oracle Clusterware GPNPD process is starting with operating system process ID 8369
2018-04-02 18:30:24.275 [GPNPD(8369)]CRS-2328: GPNPD started on node jtp1.
2018-04-02 18:30:24.283 [GIPCD(8433)]CRS-8500: Oracle Clusterware GIPCD process is starting with operating system process ID 8433
2018-04-02 18:30:26.296 [CSSDMONITOR(8464)]CRS-8500: Oracle Clusterware CSSDMONITOR process is starting with operating system process ID 8464
2018-04-02 18:30:28.299 [CSSDAGENT(8482)]CRS-8500: Oracle Clusterware CSSDAGENT process is starting with operating system process ID 8482
2018-04-02 18:30:28.496 [OCSSD(8497)]CRS-8500: Oracle Clusterware OCSSD process is starting with operating system process ID 8497
2018-04-02 18:30:29.538 [OCSSD(8497)]CRS-1713: CSSD daemon is started in hub mode
2018-04-02 18:30:36.015 [OCSSD(8497)]CRS-1707: Lease acquisition for node jtp1 number 1 completed
2018-04-02 18:30:37.087 [OCSSD(8497)]CRS-1605: CSSD voting file is online: AFD:CRS1; details in /u01/app/grid/diag/crs/jtp1/crs/trace/ocssd.trc.
2018-04-02 18:30:37.103 [OCSSD(8497)]CRS-1672: The number of voting files currently available 1 has fallen to the minimum number of voting files required 1.
2018-04-02 18:30:46.237 [OCSSD(8497)]CRS-1601: CSSD Reconfiguration complete. Active nodes are jtp1 .
2018-04-02 18:30:48.514 [OCTSSD(9302)]CRS-8500: Oracle Clusterware OCTSSD process is starting with operating system process ID 9302
2018-04-02 18:30:48.535 [OCSSD(8497)]CRS-1720: Cluster Synchronization Services daemon (CSSD) is ready for operation.
2018-04-02 18:30:50.626 [OCTSSD(9302)]CRS-2407: The new Cluster Time Synchronization Service reference node is host jtp1.
2018-04-02 18:30:50.627 [OCTSSD(9302)]CRS-2401: The Cluster Time Synchronization Service started on host jtp1.
2018-04-02 18:31:04.202 [ORAROOTAGENT(8214)]CRS-5019: All OCR locations are on ASM disk groups [CRS], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc".
2018-04-02 18:41:00.225 [ORAROOTAGENT(8214)]CRS-5818: Aborted command 'start' for resource 'ora.storage'. Details at (:CRSAGF00113:) {0:9:3} in /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc.
2018-04-02 18:41:03.757 [ORAROOTAGENT(8214)]CRS-5017: The resource action "ora.storage start" encountered the following error:
2018-04-02 18:41:03.757+Storage agent start action aborted. For details refer to "(:CLSN00107:)" in "/u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc".
2018-04-02 18:41:03.760 [OHASD(8143)]CRS-2757: Command 'Start' timed out waiting for response from the resource 'ora.storage'. Details at (:CRSPE00221:) {0:9:3} in /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd.trc.
2018-04-02 18:42:09.921 [ORAROOTAGENT(8214)]CRS-5019: All OCR locations are on ASM disk groups [CRS], and none of these disk groups are mounted. Details are at "(:CLSN00140:)" in "/u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc".
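
Most of the log above is routine startup noise; the useful lines are the few that report the ora.storage start failing. A minimal sketch of pulling those out is shown here. The function name filter_storage_errors is made up for illustration, and the list of CRS codes is only the set seen in this incident, not a complete catalogue of storage-related errors.

```shell
#!/usr/bin/env bash
# Print only the alert.log lines (read from stdin) that report OCR/ASM
# storage start failures. The codes listed are the ones that appeared in
# this incident (assumption: other incidents may need a different set).
filter_storage_errors() {
  grep -E 'CRS-(5019|5818|5017|2757):'
}

# Hypothetical usage against the alert.log path shown in this article:
# filter_storage_errors < /u01/app/grid/diag/crs/jtp1/crs/trace/alert.log
```

Note that grep exits with status 1 when nothing matches, which a calling script can use to tell that none of these errors were logged.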

Check the trace file. Errors appear when querying the ASM_DISCOVERY_ADDRESS attribute (Error 4, logged twice).
[root@jtp1 ~]# more /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc
Trace file /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root.trc
Oracle Database 12c Clusterware Release 12.2.0.1.0 - Production Copyright 1996, 2016 Oracle. All rights reserved.

*** TRACE CONTINUED FROM FILE /u01/app/grid/diag/crs/jtp1/crs/trace/ohasd_orarootagent_root_93.trc ***

2018-04-02 18:42:09.165 : CSSCLNT:3554666240: clsssterm: terminating context (0x7f03c0229390)
2018-04-02 18:42:09.165 : default:3554666240: clsCredDomClose: Credctx deleted 0x7f03c0459470
2018-04-02 18:42:09.166 :    GPNP:3554666240: clsgpnp_dbmsGetItem_profile: [at clsgpnp_dbms.c:399] Result: (0) CLSGPNP_OK. (:GPNP00401:)got ASM-Profile.Mode='remote'
2018-04-02 18:42:09.253 : CSSCLNT:3554666240: clsssinit: initialized context: (0x7f03c045c2c0) flags 0x115
2018-04-02 18:42:09.253 : CSSCLNT:3554666240: clsssterm: terminating context (0x7f03c045c2c0)
2018-04-02 18:42:09.254 :  CLSNS:3554666240: clsns_SetTraceLevel:trace level set to 1.
2018-04-02 18:42:09.254 :    GPNP:3554666240: clsgpnp_dbmsGetItem_profile: [at clsgpnp_dbms.c:399] Result: (0) CLSGPNP_OK. (:GPNP00401:)got ASM-Profile.Mode='remote'
2018-04-02 18:42:09.257 : default:3554666240: Inited LSF context: 0x7f03c04f0420
2018-04-02 18:42:09.260 : CLSCRED:3554666240: clsCredCommonInit: Inited singleton credctx.
2018-04-02 18:42:09.260 : CLSCRED:3554666240: (:CLSCRED0101:)clsCredDomInitRootDom: Using user given storage context for repository access.
2018-04-02 18:42:09.294 : USRTHRD:3554666240: {0:9:3} 8033 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS

2018-04-02 18:42:09.300 : USRTHRD:3554666240: {0:9:3} 8033 Error 4 querying length of attr ASM_DISCOVERY_ADDRESS

2018-04-02 18:42:09.356 : CLSCRED:3554666240: (:CLSCRED1079:)clsCredOcrKeyExists: Obj dom : SYSTEM.credentials.domains.root.ASM.Self.5c82286a084bcf37ffa014144074e5dd.root not found
2018-04-02 18:42:09.356 : USRTHRD:3554666240: {0:9:3} 7755 Error 4 opening dom root in 0x7f03c064c980

Check the ASM alert.log. It shows that /dev/shm is smaller than MEMORY_TARGET, and it states the minimum size that /dev/shm must be set to.
[root@jtp1 ~]# tail -f /u01/app/grid/diag/asm/+asm/+ASM1/trace/alert_+ASM1.log
WARNING: ASM does not support ipclw. Switching to skgxp
WARNING: ASM does not support ipclw. Switching to skgxp
WARNING: ASM does not support ipclw. Switching to skgxp
* instance_number obtained from CSS = 1, checking for the existence of node 0...
* node 0 does not exist. instance_number = 1
Starting ORACLE instance (normal) (OS id: 9343)
2018-04-02T18:31:00.187055+08:00
CLI notifier numLatches:7 maxDescs:2301
2018-04-02T18:31:00.193961+08:00
WARNING: You are trying to use the MEMORY_TARGET feature. This feature requires the /dev/shm file system to be mounted for at least 1140850688 bytes. /dev/shm is either not mounted or is mounted with available space less than this size. Please fix this so that MEMORY_TARGET can work as expected. Current available is 1073573888 and used is 167936 bytes. Ensure that the mount point is /dev/shm for this directory.
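
The warning already contains the two numbers that matter: the minimum size MEMORY_TARGET needs and the space currently available in /dev/shm. The sketch below extracts them and computes the shortfall. It assumes the warning wording matches the line above exactly; parse_shm_warning and shm_shortfall are illustrative helper names, not part of any Oracle tooling.

```shell
#!/usr/bin/env bash
# Read an ASM alert log on stdin and print, for each MEMORY_TARGET warning,
# "<required bytes> <available bytes>". Assumes the message wording shown
# in the alert log excerpt above.
parse_shm_warning() {
  sed -n 's/.*mounted for at least \([0-9][0-9]*\) bytes.*Current available is \([0-9][0-9]*\) and used.*/\1 \2/p'
}

# Print how many more bytes /dev/shm needs (0 if it is already large enough).
shm_shortfall() {
  local required=$1 available=$2
  if [ "$required" -gt "$available" ]; then
    echo $((required - available))
  else
    echo 0
  fi
}

# Values taken from the warning above.
shm_shortfall 1140850688 1073573888   # prints 67276800
```

In this incident the 1.0G tmpfs was therefore short by roughly 64 MiB, which was enough to keep the ASM instance from starting with MEMORY_TARGET.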

The size of /dev/shm can be changed by editing /etc/fstab. Here /dev/shm is set to 12G.
[root@jtp1 bin]# df -h
Filesystem          Size  Used Avail Use% Mounted on
/dev/mapper/ol-root  49G  42G  7.9G  85% /
devtmpfs              12G  28K  12G  1% /dev
tmpfs                1.0G  164K  1.0G  1% /dev/shm
tmpfs                1.0G  9.3M 1015M  1% /run
tmpfs                1.0G    0  1.0G  0% /sys/fs/cgroup
/dev/sda1          1014M  141M  874M  14% /boot
[root@jtp1 bin]# vi /etc/fstab

#
# /etc/fstab
# Created by anaconda on Sat Mar 18 15:27:13 2017
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/ol-root    /                      xfs    defaults        0 0
UUID=ca5854cd-0125-4954-a5c4-1ac42c9a0f70 /boot                  xfs    defaults        0 0
/dev/mapper/ol-swap    swap                    swap    defaults        0 0

tmpfs                  /dev/shm                tmpfs  defaults,size=12G        0 0
tmpfs                  /run                    tmpfs  defaults,size=12G        0 0
tmpfs                  /sys/fs/cgroup          tmpfs  defaults,size=12G        0 0
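
Before restarting the cluster it can be worth confirming that /dev/shm really has the space the alert log asked for. The following is a minimal sketch of such a check; shm_has_room is a hypothetical helper name, and the commented-out remount command is an assumption about how the new size could be applied without a reboot, not a step the original procedure used.

```shell
#!/usr/bin/env bash
# Return success if the file system at mount point $2 (default /dev/shm)
# has at least $1 bytes available. Uses the POSIX "df -P" output format,
# where column 4 of the second line is the available space in 1K blocks.
shm_has_room() {
  local required_bytes=$1 mount_point=${2:-/dev/shm}
  local avail_kb
  avail_kb=$(df -Pk "$mount_point" 2>/dev/null | awk 'NR==2 {print $4}')
  [ -n "$avail_kb" ] || return 2   # df failed or printed nothing
  [ $((avail_kb * 1024)) -ge "$required_bytes" ]
}

# Assumption: as root, a tmpfs can usually be grown in place with
#   mount -o remount,size=12G /dev/shm
# Afterwards, check against the minimum from the ASM alert log:
#   shm_has_room 1140850688 && echo "/dev/shm is large enough"
```

The helper only compares available space with a byte count, so it can also be pointed at another mount point by passing it as the second argument.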

After restarting the cluster, check the cluster resource status again. It is back to normal.
[grid@jtp1 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
Name          Target  State        Server                  State details
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.ASMNET1LSNR_ASM.lsnr
              ONLINE  ONLINE      jtp1                  STABLE
              ONLINE  ONLINE      jtp2                  STABLE
ora.CRS.dg
              ONLINE  ONLINE      jtp1                  STABLE
              ONLINE  ONLINE      jtp2                  STABLE
ora.DATA.dg
              ONLINE  ONLINE      jtp1                  STABLE
              ONLINE  ONLINE      jtp2                  STABLE
ora.FRA.dg
              ONLINE  ONLINE      jtp1                  STABLE
              ONLINE  ONLINE      jtp2                  STABLE
ora.LISTENER.lsnr
              ONLINE  ONLINE      jtp1                  STABLE
              ONLINE  ONLINE      jtp2                  STABLE
ora.TEST.dg
              ONLINE  ONLINE      jtp1                  STABLE
              ONLINE  ONLINE      jtp2                  STABLE
ora.chad
              ONLINE  ONLINE      jtp1                  STABLE
              ONLINE  ONLINE      jtp2                  STABLE
ora.net1.network
              ONLINE  ONLINE      jtp1                  STABLE
              ONLINE  ONLINE      jtp2                  STABLE
ora.ons
              ONLINE  ONLINE      jtp1                  STABLE
              ONLINE  ONLINE      jtp2                  STABLE
ora.proxy_advm
              OFFLINE OFFLINE      jtp1                  STABLE
              OFFLINE OFFLINE      jtp2                  STABLE
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE      jtp1                  STABLE
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE      jtp2                  STABLE
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE      jtp2                  STABLE
ora.MGMTLSNR
      1        ONLINE  ONLINE      jtp2                  169.254.237.250 88.8
                                                            8.88.2,STABLE
ora.asm
      1        ONLINE  ONLINE      jtp1                  Started,STABLE
      2        ONLINE  ONLINE      jtp2                  Started,STABLE
      3        OFFLINE OFFLINE                              STABLE
ora.cvu
      1        ONLINE  ONLINE      jtp2                  STABLE
ora.jy.db
      1        ONLINE  OFFLINE                              STABLE
      2        ONLINE  OFFLINE                              STABLE
ora.jtp1.vip
      1        ONLINE  ONLINE      jtp1                  STABLE
ora.jtp2.vip
      1        ONLINE  ONLINE      jtp2                  STABLE
ora.mgmtdb
      1        ONLINE  ONLINE      jtp2                  Open,STABLE
ora.qosmserver
      1        ONLINE  ONLINE      jtp2                  STABLE
ora.scan1.vip
      1        ONLINE  ONLINE      jtp1                  STABLE
ora.scan2.vip
      1        ONLINE  ONLINE      jtp2                  STABLE
ora.scan3.vip
      1        ONLINE  ONLINE      jtp2                  STABLE
--------------------------------------------------------------------------------

At this point the cluster is back to normal.

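Copyright notice: original article from this site, published by 星锅 on 2022-01-22, 12317 characters in total.
Reprint notice: unless otherwise stated, articles on this site are published under the CC-4.0 license. Please credit the source when reprinting.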