A Simple Illustrated Guide to Oracle Character Sets: Fixing Garbled Chinese Text


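Garbled output in SQL*Plus is a common problem. The character sets and their related settings are all documented, but few explanations lay out how they relate to one another in a simple, easy-to-follow way.

Before going further we need to get three concepts straight: the operating system character set, the client character set, and the Oracle database character set: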

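Operating system character set: controlled by the LANG variable. It should be a superset of the Oracle database character set; if the operating system cannot represent the characters, the data shows up garbled. The operating system meant here is the client's operating system. The server-side operating system does not affect how data is stored or retrieved.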

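Database character set: NLS_CHARACTERSET. The current database character set can be checked in nls_database_parameters. It is chosen when the database is installed and is normally left alone. It can be changed when the new character set is a strict superset of the existing one, for example from UTF8 to AL32UTF8; changing it in any other case may break the database.

For the subset-superset mappings, see the Binary Subset-Superset Pairs table in the official Oracle documentation: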

http://docs.oracle.com/database/121/NLSPG/applocaledata.htm#NLSPG591 
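
To make the idea of a binary superset concrete, here is a minimal sketch in Python. It does not consult Oracle's own tables: it uses Python's built-in codecs as stand-ins (ascii for US7ASCII, cp1252 for WE8MSWIN1252, utf-8 for AL32UTF8, all assumed equivalences), and the function name is made up for illustration. It only checks single bytes, so it is meaningful only when the candidate subset is a single-byte character set.

```python
def is_binary_superset(subset: str, superset: str) -> bool:
    """Return True if every single byte defined in `subset` decodes
    to the same character in `superset`."""
    for value in range(256):
        raw = bytes([value])
        try:
            expected = raw.decode(subset)
        except UnicodeDecodeError:
            continue  # this byte is not defined in the subset; nothing to compare
        try:
            if raw.decode(superset) != expected:
                return False  # same byte, different character
        except UnicodeDecodeError:
            return False  # byte is valid in the subset but not in the superset
    return True


# ascii bytes mean the same thing in utf-8, so stored data needs no conversion
print(is_binary_superset("ascii", "utf-8"))
# cp1252 bytes above 0x7F are not valid utf-8 on their own
print(is_binary_superset("cp1252", "utf-8"))
```

When the check fails, existing bytes would have to be converted rather than simply relabelled, which is why changing the character set outside a superset pair is risky.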

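Client character set: controlled by NLS_LANG. If the client does not set it, the session language and territory fall back to the defaults chosen when the database was installed (as the test below shows), while the character set part defaults to US7ASCII.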

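To make this easier to follow I drew the diagram below. When the parts marked in red match, the data is encoded the same way at every hop: if LANG, NLS_LANG, and NLS_CHARACTERSET all use the same encoding, such as UTF8, nothing goes wrong in transit, and any garbled characters are purely a display problem.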

[Figure: a simple diagram of Oracle character sets]
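
The point about matching encodings can be illustrated outside Oracle. The sketch below is not how Oracle converts data; it uses Python's built-in codecs, with cp1252 standing in for WE8MSWIN1252 (an assumed equivalence), to show that bytes written and read with the same encoding survive, while reading them with a different encoding produces garbage. The function name is hypothetical.

```python
def transfer(text: str, writer_encoding: str, reader_encoding: str) -> str:
    """Encode text the way the writer would, then decode it the way the reader would."""
    raw = text.encode(writer_encoding)
    # errors="replace" keeps the demo from raising on bytes the reader cannot decode
    return raw.decode(reader_encoding, errors="replace")


sample = "中文"
same = transfer(sample, "utf-8", "utf-8")    # both sides agree
mixed = transfer(sample, "utf-8", "cp1252")  # reader assumes a different encoding

print(same == sample)   # round trip is lossless
print(mixed == sample)  # mismatch garbles the text
print(repr(mixed))
```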

1. Operating system character set

On Linux, first run locale to check the character set:

[oracle@oddpc ~]$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
[oracle@oddpc ~]$ echo $LANG
en_US.UTF-8

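2. This host does not have the Chinese language support package installed. After setting LANG the result is as follows; clearly, no matter how NLS_LANG is adjusted, this machine cannot display Chinese.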

[oracle@evenpc ~]$ export LANG=zh_CN.utf8
[oracle@evenpc ~]$ date
2016? 10? 13? ??? 15:17:01 CST
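
Before touching NLS_LANG it can help to confirm that the locale named in LANG is actually installed. The following is a minimal sketch using Python's standard locale module; the helper name is made up for illustration, and the result depends on which locales the machine running it has.

```python
import locale


def locale_available(name: str) -> bool:
    """Return True if the C library accepts `name` as a locale on this host."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
        return True
    except locale.Error:
        return False  # locale is not installed or not recognised
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # always restore the original


# "C" always exists; zh_CN.utf8 exists only if Chinese support is installed
for candidate in ("C", "zh_CN.utf8"):
    print(candidate, locale_available(candidate))
```

If zh_CN.utf8 reports False, the question marks in the date output above are expected, and the fix is at the operating system level rather than in Oracle.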

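3. Install Chinese language support. Running yum -y groupinstall chinese-support installs the package (the installation itself is skipped here); once it is installed, Chinese displays correctly.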

[oracle@oddpc ~]$ export LANG=zh_CN.utf8
[oracle@oddpc ~]$ date
2016 年 10 月 13 日 星期四 15:14:19 CST

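4. Next comes the display test. I installed two database instances, PROD1 and PROD5. PROD1 uses the WE8MSWIN1252 character set and PROD5 uses AL32UTF8.

By default NLS_LANG is empty, in which case the session language takes the value chosen at installation: AMERICAN for PROD1 and SIMPLIFIED CHINESE for PROD5. The PROD1 session is shown first.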

[oracle@oddpc ~]$ echo $NLS_LANG
[oracle@oddpc ~]$
SQL> show parameter lang        

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
nls_date_language                    string
nls_language                         string      AMERICAN
SQL> select sysdate from dual;

SYSDATE
---------
13-OCT-16
 

PROD5

SQL> show parameter lang  

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
nls_date_language                    string
nls_language                         string      SIMPLIFIED CHINESE
SQL> select sysdate from dual;

SYSDATE
----------
13-10?-16
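
One plausible reading of the question mark in the PROD5 date is that the Chinese month character could not be represented in the client character set and was substituted. The sketch below reproduces that effect with Python codecs. It assumes the client falls back to a 7-bit ASCII character set when NLS_LANG is unset and that the server produced the string "13-10月-16"; both are assumptions, and the function name is hypothetical.

```python
def to_client(text: str, client_encoding: str) -> str:
    """Convert text to the client encoding, substituting '?' for
    anything that encoding cannot represent."""
    raw = text.encode(client_encoding, errors="replace")
    return raw.decode(client_encoding)


date_from_server = "13-10月-16"  # assumed server-side value

print(to_client(date_from_server, "ascii"))  # 13-10?-16
print(to_client(date_from_server, "utf-8"))  # 13-10月-16
```

The substitution is lossy: once the character has become a question mark, no display setting can bring it back.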

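5. PROD5 shows a garbled date while PROD1's English output is fine. Now set the NLS_LANG parameter.

The PROD1 result is below. The messages are now in Chinese, but because the database character set is not UTF8, Chinese characters come out garbled once they are stored.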

[oracle@oddpc ~]$ export NLS_LANG="SIMPLIFIED CHINESE_CHINA.UTF8"
[oracle@oddpc ~]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.3.0 Production on 星期四 10 月 13 15:42:46 2016

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


连接到: 
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> show parameter lang

NAME                                 TYPE                              VALUE
------------------------------------ --------------------------------- ------------------------------
nls_date_language                    string                            SIMPLIFIED CHINESE
nls_language                         string                            SIMPLIFIED CHINESE
SQL> show parameter db_name

NAME                                 TYPE                              VALUE
------------------------------------ --------------------------------- ------------------------------
db_name                              string                            PROD1
SQL> show parameter lang

NAME                                 TYPE                              VALUE
------------------------------------ --------------------------------- ------------------------------
nls_date_language                    string                            SIMPLIFIED CHINESE
nls_language                         string                            SIMPLIFIED CHINESE
SQL> select sysdate from dual;

SYSDATE
------------
13-10? -16
SQL> select * from nls_database_parameters;

PARAMETER                                VALUE
---------------------------------------- ----------------------------------------
NLS_LANGUAGE                             AMERICAN
NLS_TERRITORY                            AMERICA
NLS_CURRENCY                             $
NLS_ISO_CURRENCY                         AMERICA
NLS_NUMERIC_CHARACTERS                   .,
NLS_CHARACTERSET                         WE8MSWIN1252
NLS_CALENDAR                             GREGORIAN
NLS_DATE_FORMAT                          DD-MON-RR
NLS_DATE_LANGUAGE                        AMERICAN
NLS_SORT                                 BINARY
NLS_TIME_FORMAT                          HH.MI.SSXFF AM

PARAMETER                                VALUE
---------------------------------------- ----------------------------------------
NLS_TIMESTAMP_FORMAT                     DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT                       HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT                  DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY                        $
NLS_COMP                                 BINARY
NLS_LENGTH_SEMANTICS                     BYTE
NLS_NCHAR_CONV_EXCP                      FALSE
NLS_NCHAR_CHARACTERSET                   AL16UTF16
NLS_RDBMS_VERSION                        11.2.0.3.0

已选择 20 行。
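
The PROD1 result comes down to whether the database character set has a code for the characters at all. Here is a rough way to check that for a given string, using Python codecs as stand-ins for the Oracle character sets. The name mapping is an assumption made for illustration, not something Oracle provides, and the helper name is hypothetical.

```python
# Assumed correspondence between Oracle character set names and Python codecs
PYTHON_CODEC = {
    "US7ASCII": "ascii",
    "WE8MSWIN1252": "cp1252",
    "ZHS16GBK": "gbk",
    "AL32UTF8": "utf-8",
}


def can_store(text: str, oracle_charset: str) -> bool:
    """Return True if every character of `text` has a code in the character set."""
    codec = PYTHON_CODEC[oracle_charset]
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


for name in ("WE8MSWIN1252", "ZHS16GBK", "AL32UTF8"):
    print(name, can_store("星期四", name))
```

WE8MSWIN1252 fails the check for Chinese text, which matches the garbled month on PROD1 even after NLS_LANG is set.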
 
 

The PROD5 result is below. This time PROD5 displays correctly.

[oracle@oddpc ~]$ export NLS_LANG="SIMPLIFIED CHINESE_CHINA.UTF8"
[oracle@oddpc ~]$ sqlplus / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on 星期四 10 月 13 15:46:36 2016

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


连接到: 
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
SQL> show parameter db_name

NAME                                 TYPE                              VALUE
------------------------------------ --------------------------------- ------------------------------
db_name                              string                            PROD5
SQL> select sysdate from dual;

SYSDATE
------------
13-10 月 -16

SQL> show parameter lang

NAME                                 TYPE                              VALUE
------------------------------------ --------------------------------- ------------------------------
nls_date_language                    string                            SIMPLIFIED CHINESE
nls_language                         string                            SIMPLIFIED CHINESE
SQL> select * from nls_database_parameters;

PARAMETER                                VALUE
---------------------------------------- ----------------------------------------
NLS_LANGUAGE                             AMERICAN
NLS_TERRITORY                            AMERICA
NLS_CURRENCY                             $
NLS_ISO_CURRENCY                         AMERICA
NLS_NUMERIC_CHARACTERS                   .,
NLS_CHARACTERSET                         AL32UTF8
NLS_CALENDAR                             GREGORIAN
NLS_DATE_FORMAT                          DD-MON-RR
NLS_DATE_LANGUAGE                        AMERICAN
NLS_SORT                                 BINARY
NLS_TIME_FORMAT                          HH.MI.SSXFF AM

PARAMETER                                VALUE
---------------------------------------- ----------------------------------------
NLS_TIMESTAMP_FORMAT                     DD-MON-RR HH.MI.SSXFF AM
NLS_TIME_TZ_FORMAT                       HH.MI.SSXFF AM TZR
NLS_TIMESTAMP_TZ_FORMAT                  DD-MON-RR HH.MI.SSXFF AM TZR
NLS_DUAL_CURRENCY                        $
NLS_COMP                                 BINARY
NLS_LENGTH_SEMANTICS                     BYTE
NLS_NCHAR_CONV_EXCP                      FALSE
NLS_NCHAR_CHARACTERSET                   AL16UTF16
NLS_RDBMS_VERSION                        11.2.0.3.0

已选择 20 行。
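
The NLS_LANG value used in both sessions packs three settings into one string in the form LANGUAGE_TERRITORY.CHARSET. A small sketch of splitting it apart follows; the type and function names are made up for illustration, and it only handles the full three-part form used above.

```python
from typing import NamedTuple


class NlsLang(NamedTuple):
    language: str
    territory: str
    charset: str


def parse_nls_lang(value: str) -> NlsLang:
    """Split a full NLS_LANG value of the form LANGUAGE_TERRITORY.CHARSET."""
    head, _, charset = value.partition(".")       # charset follows the dot
    language, _, territory = head.partition("_")  # language may contain spaces
    return NlsLang(language, territory, charset)


parsed = parse_nls_lang("SIMPLIFIED CHINESE_CHINA.UTF8")
print(parsed.language)   # SIMPLIFIED CHINESE
print(parsed.territory)  # CHINA
print(parsed.charset)    # UTF8
```

The language part drives message and date-name language, while the character set part tells Oracle how the client encodes text, which is the piece that matters for garbling.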
 

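Summary: the experiments above show that whether the client display is garbled is determined by NLS_LANG. When Chinese text is garbled, first check whether the database's NLS_CHARACTERSET can store Chinese at all; if it cannot, no setting will make Chinese display correctly. The official Oracle documentation lists the character sets that support each language, as follows.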

http://docs.oracle.com/database/121/NLSPG/applocaledata.htm#NLSPG593

Table A-13 Languages and Character Sets Supported by LCSSCAN and GDK

Language             Character Sets
-------------------- --------------------------------------------------------------------------------
Arabic               AL16UTF16, AL32UTF8, AR8ISO8859P6, AR8MSWIN1256, UTF8
Bulgarian            AL16UTF16, AL32UTF8, CL8ISO8859P5, CL8MSWIN1251, UTF8
Catalan              AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Croatian             AL16UTF16, AL32UTF8, EE8ISO8859P2, EE8MSWIN1250, UTF8
Czech                AL16UTF16, AL32UTF8, EE8ISO8859P2, EE8MSWIN1250, UTF8
Danish               AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Dutch                AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
English              AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Estonian             AL16UTF16, AL32UTF8, NEE8ISO8859P4, UTF8
Finnish              AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
French               AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
German               AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Greek                AL16UTF16, AL32UTF8, EL8ISO8859P7, EL8MSWIN1253, UTF8
Hebrew               AL16UTF16, AL32UTF8, IW8ISO8859P8, IW8MSWIN1255, UTF8
Hindi                AL16UTF16, AL32UTF8, IN8ISCII, UTF8
Hungarian            AL16UTF16, AL32UTF8, EE8ISO8859P2, EE8MSWIN1250, UTF8
Indonesian           AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Italian              AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Japanese             AL16UTF16, AL32UTF8, ISO2022-JP, JA16EUC, JA16SJIS, UTF8
Korean               AL16UTF16, AL32UTF8, ISO2022-KR, KO16KSC5601, KO16MSWIN949, UTF8
Latvian              AL16UTF16, AL32UTF8, NEE8ISO8859P4, UTF8
Lithuanian           AL16UTF16, AL32UTF8, NEE8ISO8859P4, UTF8
Malay                AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Norwegian            AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Persian              AL16UTF16, AL32UTF8, AR8MSWIN1256, UTF8
Polish               AL16UTF16, AL32UTF8, EE8ISO8859P2, EE8MSWIN1250, UTF8
Portuguese           AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Romanian             AL16UTF16, AL32UTF8, EE8ISO8859P2, EE8MSWIN1250, UTF8
Russian              AL16UTF16, AL32UTF8, CL8ISO8859P5, CL8KOI8R, CL8MSWIN1251, RU8PC866, UTF8
Serbian              AL16UTF16, AL32UTF8, CL8ISO8859P5, CL8MSWIN1251, UTF8
Simplified Chinese   AL16UTF16, AL32UTF8, HZ-GB-2312, UTF8, ZHS16GBK, ZHS16CGB231280
Slovak               AL16UTF16, AL32UTF8, EE8ISO8859P2, EE8MSWIN1250, UTF8
Slovenian            AL16UTF16, AL32UTF8, EE8ISO8859P2, EE8MSWIN1250, UTF8
Spanish              AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Swedish              AL16UTF16, AL32UTF8, US7ASCII, UTF8, WE8ISO8859P1, WE8ISO8859P15, WE8MSWIN1252
Thai                 AL16UTF16, AL32UTF8, TH8TISASCII, UTF8
Traditional Chinese  AL16UTF16, AL32UTF8, UTF8, ZHT16MSWIN950
Turkish              AL16UTF16, AL32UTF8, TR8MSWIN1254, UTF8, WE8ISO8859P9
Ukrainian            AL16UTF16, AL32UTF8, CL8ISO8859P5, CL8MSWIN1251, UTF8
Vietnamese           AL16UTF16, AL32UTF8, VN8VN3, UTF8


Permanent link to this article: http://www.linuxidc.com/Linux/2016-10/136123.htm

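Copyright notice: original article on this site, published by 星锅 on 2022-01-22, 8200 characters in total.
Reposting: unless otherwise noted, articles on this site are released under the CC-4.0 license; please credit the source when reposting.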