Streams Failure Causes Oracle to Keep Generating trc Files

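On a production database in one province, we received an alert SMS saying the Oracle directory was 100% full. On logging in to the system, the log was simply flooded with errors: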
Mon Oct 23 23:25:18 EAT 2017
Thread 1 advanced to log sequence 39420 (LGWR switch)
  Current log# 7 seq# 39420 mem# 0: /vgxxx03/oradata/xxx/vg03_1_rd71.log
  Current log# 7 seq# 39420 mem# 1: /vgxxx04/oradata/xxx/vg04_1_rd72.log
Tue Oct 24 01:02:04 EAT 2017
Errors in file /oracle/admin/esshlj/udump/xxx1_ora_11790.trc:
ORA-07445: exception encountered: core dump [kadfih()+33] [SIGSEGV] [Address not mapped to object] [0x91A1A1A2A000F] [] []
Tue Oct 24 01:02:06 EAT 2017
Trace dumping is performing id=[cdmp_20171024010206]
Tue Oct 24 01:02:41 EAT 2017
Errors in file /oracle/admin/1_pmon_27125.trc:
ORA-07445: exception encountered: core dump [kadfih()+33] [SIGSEGV] [Address not mapped to object] [0x91A1A1A2A000F] [] []
Tue Oct 24 01:02:43 EAT 2017
Errors in file /oracle/admin/1_pmon_27125.trc:
ORA-00001: unique constraint (108.) violated
ORA-00001: unique constraint (108.) violated
ORA-00001: unique constraint (.) violated
ORA-00602: internal programming exception
ORA-00602: internal programming exception
ORA-00602: internal programming exception
ORA-07445: exception encountered: core dump [kadfih()+33] [SIGSEGV] [Address not mapped to object] [0x91A1A1A2A000F] [] []
Tue Oct 24 01:07:10 EAT 2017
Errors in file /oracle/admin/1_ora_16532.trc:
ORA-00600: internal error code, arguments: [kghfrmrg:prv], [0xC00000073B406C28], [], [], [], [], [], []
Tue Oct 24 01:08:15 EAT 2017
Errors in file /oracle/admin/1_ora_16532.trc:
ORA-00600: internal error code, arguments: [KGHFRE3], [0xC0000007433BACB0], [], [], [], [], [], []
Tue Oct 24 01:10:13 EAT 2017
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Error 22813 occured while spilling buffered messages
Tue Oct 24 01:10:18 EAT 2017
Errors in file /oracle/admin/1_q003_26244.trc:
ORA-00600: internal error code, arguments: [kpotcprc: uga depth exceeded], [16], [], [], [], [], [], []
Error 600 occured while spilling buffered messages
Tue Oct 24 01:10:20 EAT 2017
Errors in file /oracle/admin/1_q003_26244.trc:
ORA-00600: internal error code, arguments: [kpotcprc: uga depth exceeded], [16], [], [], [], [], [], []
Error 600 occured while spilling buffered messages
Tue Oct 24 01:10:23 EAT 2017
Errors in file /oracle/admin/1_q003_26244.trc:
ORA-00600: internal error code, arguments: [kpotcprc: uga depth exceeded], [16], [], [], [], [], [], []
…………………………………………………………………………………………………………………………
Tue Oct 24 15:00:26 EAT 2017
Errors in file /oracle/admin/1_q007_20218.trc:
ORA-00600: internal error code, arguments: [kgqbt_search_block198], [], [], [], [], [], [], []
Tue Oct 24 15:00:26 EAT 2017
Errors in file /oracle/admin/1_q005_20204.trc:
ORA-00600: internal error code, arguments: [kgqbt_search_block198], [], [], [], [], [], [], []
Tue Oct 24 15:00:27 EAT 2017
Errors in file /oracle/admin/1_q000_19435.trc:
ORA-07445: exception encountered: core dump [kocgpn2()+224] [SIGSEGV] [Address not mapped to object] [0x600000002] [] []
Tue Oct 24 15:00:28 EAT 2017
Errors in file /oracle/admin/1_q004_20225.trc:
ORA-00600: internal error code, arguments: [kgqbt_search_block198], [], [], [], [], [], [], []
Tue Oct 24 15:00:28 EAT 2017
Errors in file /oracle/admin/1_q007_20218.trc:
ORA-00600: internal error code, arguments: [kgqbt_search_block198], [], [], [], [], [], [], []
Tue Oct 24 15:00:29 EAT 2017
Errors in file /oracle/admin/1_q005_20234.trc:
ORA-00600: internal error code, arguments: [kgqbt_search_block198], [], [], [], [], [], [], []
Tue Oct 24 15:00:29 EAT 2017
Errors in file /oracle/admin/1_q004_20225.trc:
ORA-00600: internal error code, arguments: [kgqbt_search_block198], [], [], [], [], [], [], []
Tue Oct 24 15:00:30 EAT 2017
Errors in file /oracle/admin/1_q003_19467.trc:
ORA-07445: exception encountered: core dump [kocgpn2()+224] [SIGSEGV] [Address not mapped to object] [0x600000002] [] []

Our initial judgment was that Streams was the cause. Node 2 was also producing Streams-related cdmp files:
Tue Oct 24 04:56:50 EAT 2017
Trace dumping is performing id=[cdmp_20171024015310]
Streams process APPLY_xxx altered (1) on instance 1
Tue Oct 24 05:11:34 EAT 2017
Trace dumping is performing id=[cdmp_20171024015317]
Streams process APPLY_xxx altered (1) on instance 1
Streams process APPLY_xxx altered (1) on instance 1
Tue Oct 24 05:26:18 EAT 2017
Trace dumping is performing id=[cdmp_20171024015321]
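When an alert log is flooded like the excerpts above, a quick count per error code helps show which errors dominate. The following is a minimal sketch, not part of the original troubleshooting: the function name `summarize_ora_errors` is hypothetical, it reads the log on standard input, and it counts only the first ORA- code on each line.

```shell
# Count ORA-nnnnn error codes in an Oracle alert log, most frequent first.
# Hypothetical helper: reads the log on stdin, prints "<count> <code>" lines.
summarize_ora_errors() {
    awk '
        # Record the first ORA-nnnnn code found on each line.
        match($0, /ORA-[0-9][0-9][0-9][0-9][0-9]/) {
            count[substr($0, RSTART, RLENGTH)]++
        }
        END {
            for (code in count) print count[code], code
        }
    ' | sort -rn
}
```

It could be run as `summarize_ora_errors < /path/to/alert.log`, where the path is a placeholder for the instance's actual alert log location.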

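I immediately phoned the vendor's staff and asked them to check whether Streams was healthy; they reported that its status was abnormal. Kill its p000 and p001 processes, then stop the apply process right away, otherwise the errors keep coming!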

SQL> show user
USER is "STRMADMIN"
SQL> execute dbms_apply_adm.stop_apply (apply_name => 'apply_esshlj');

PL/SQL procedure successfully completed.

SQL>  select apply_name,queue_name,status from dba_apply;

APPLY_NAME                    QUEUE_NAME                    STATUS
------------------------------ ------------------------------ --------
APPLY_ESSHLJ                  ESSHLJ_QUEUE                  DISABLED  ### the status is now DISABLED; after the apply process is started it shows ENABLED ###

Next I purged the buffered messages. This should be run as strmadmin:
declare
  options dbms_aqadm.aq$_purge_options_t;
begin
  options.delivery_mode := dbms_aqadm.buffered;
  dbms_aqadm.purge_queue_table('ESSHLJ_QUEUE_TABLE', null, options);
end;
/
It failed with an error:
ERROR at line 1:

ORA-03113: end-of-file on communication channel

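On node 1, trc files kept being generated: about 5 files per second, each around 2 MB, while the oracle directory is only 20 GB. We wrote a script to delete the trc files automatically.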
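At roughly 5 files per second and 2 MB each, that is about 10 MB/s, so an empty 20 GB directory would fill in roughly half an hour. The original cleanup script is not shown; the following is a minimal sketch of what such a script might look like. The function name `purge_old_trc` and its arguments are hypothetical, and it assumes a `find` that supports `-mmin` (GNU and BSD `find` do; some older Unix versions may not).

```shell
# Delete *.trc files under a directory that were last modified more than
# a given number of minutes ago.
# Hypothetical helper; assumes find supports -mmin.
purge_old_trc() {
    dir=$1       # trace directory to clean
    minutes=$2   # keep files modified within this many minutes
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # Only regular files ending in .trc are removed; other files are left alone.
    find "$dir" -type f -name '*.trc' -mmin +"$minutes" -exec rm -f {} +
}
```

A call such as `purge_old_trc /path/to/trace/dir 5` (the path is a placeholder) could be scheduled from cron or run in a loop with `sleep`. Note that removing a file a process still holds open does not free its space until that process closes it.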
1. Tried flush buffer_cache;

2. Tried flush shared_pool;

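Because only node 1 was reporting these errors, we decided to switch Streams over to node 2 and see whether the errors persisted. The following command was run on the target-side central database:
exec DBMS_AQADM.ALTER_QUEUE_TABLE@esshlj(queue_table => 'ESSHLJ_QUEUE_TABLE', primary_instance => 2, secondary_instance => 1);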

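The p000 and p001 processes were no longer present on node 1. Streams status on node 2 was normal and the ORA-600 errors were gone, but the ORA-7445 errors persisted. Because the errors kept naming slave processes such as q00x, we temporarily set the aq_tm_processes parameter to 0 to see whether the errors continued.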

SQL> !ps -ef |grep q00
  oracle  8955  8883  1 15:22:05 pts/5    0:00 /sbin/sh -c ps -ef |grep q00
  oracle  8957  8955  0 15:22:05 pts/5    0:00 grep q00
  oracle 27097    1  0 14:37:32 ?        0:00 ora_q006_esshlj1
  oracle  8820    1 51 15:21:52 ?        0:00 ora_q003_esshlj1
  oracle  8018    1  0 15:21:41 ?        0:00 ora_q008_esshlj1
  oracle  6752    1  0 15:21:12 ?        0:00 ora_q007_esshlj1
  oracle  7641    1  0 15:21:27 ?        0:00 ora_q004_esshlj1
  oracle  8863    1  1 15:21:58 ?        0:00 ora_q000_esshlj1
  oracle  8839    1  0 15:21:55 ?        0:00 ora_q009_esshlj1
  oracle  8931    1  4 15:22:01 ?        0:00 ora_q001_esshlj1
  oracle  8954    1  6 15:22:04 ?        0:00 ora_q002_esshlj1

SQL> alter system set aq_tm_processes = 0;          ##### set the parameter to 0 #####
System altered.

SQL> !ps -ef |grep q00
  oracle 11224  8883  0 15:23:08 pts/5    0:00 /sbin/sh -c ps -ef |grep q00
  oracle 11226 11224  0 15:23:08 pts/5    0:00 grep q00

The slave processes were gone, but the errors persisted. What a headache; let's change it back.
SQL> alter system set aq_tm_processes = 1;
System altered.

SQL> !ps -ef |grep q00
  oracle 12198  8883  0 15:23:35 pts/5    0:00 /sbin/sh -c ps -ef |grep q00
  oracle 12203 12198  0 15:23:35 pts/5    0:00 grep q00
  oracle 12135    1 20 15:23:35 ?        0:00 ora_q000_esshlj1
  oracle 12199    1  3 15:23:35 ?        0:00 ora_q002_esshlj1
  oracle 12167    1  7 15:23:35 ?        0:00 ora_q001_esshlj1
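In the `ps -ef | grep q00` output above, the `grep` command and its parent shell also show up, and the pattern `q00` would miss any slave numbered q010 or higher. A small filter that matches only on the process name avoids both issues. This is a sketch under assumptions: the function name `list_q_slaves` is hypothetical, and it assumes the process name is the last field of each `ps -ef` line, as in the output above.

```shell
# Print "<pid> <process name>" for Oracle qNNN slave processes found in
# `ps -ef` output supplied on stdin.
# Hypothetical helper; assumes the process name is the last field.
list_q_slaves() {
    awk '$NF ~ /^ora_q[0-9][0-9][0-9]_/ { print $2, $NF }'
}
```

Usage: `ps -ef | list_q_slaves`. Empty output means no such slave processes are running on that node.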

A colleague ventured a bold guess: since Streams no longer runs on node 1, could we set aq_tm_processes to 0 on node 1 and to 1 on node 2, specifying the SID in each case?

SQL>  alter system set aq_tm_processes = 0 sid='esshlj1';
System altered.
After that, node 1 stopped producing the annoying ORA-7445 alerts:
Tue Oct 24 15:36:08 EAT 2017
ALTER SYSTEM SET aq_tm_processes=0 SCOPE=BOTH SID='esshlj1';
Tue Oct 24 15:45:58 EAT 2017
Trace dumping is performing id=[cdmp_20171024152311]
Tue Oct 24 16:43:44 EAT 2017
Thread 1 advanced to log sequence 39430 (LGWR switch)
  Current log# 1 seq# 39430 mem# 0: /vghlj/oradata/redo01.log
  Current log# 1 seq# 39430 mem# 1: /vghlj/oradata/1_rd11.log

For more Oracle-related information, see the Oracle topic page: http://www.linuxidc.com/topicnews.aspx?tid=12

Permanent link to this article: http://www.linuxidc.com/Linux/2017-10/147975.htm

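Copyright notice: original article on this site, published by 星锅 on 2022-01-22, 7183 characters in total.
Reprint notice: unless otherwise stated, articles on this site are published under the CC-4.0 license; please credit the source when reprinting.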