https://metalink.oracle.com/CSP/main/article?cmd=show&type=NOT&...

Subject: Changing the Database Character Set ( NLS_CHARACTERSET )
Doc ID: 225912.1
Modified Date: 25-SEP-2009
Type: BULLETIN
Status: PUBLISHED

In this Document
  Purpose
  Scope and Application
  Changing the Database Character Set ( NLS_CHARACTERSET )
    A) The database character set (NLS_CHARACTERSET)
    B) Choosing a new database character set
    C) Changing the database character set
      C1) Using the "ALTER DATABASE CHARACTER SET" command in 8i or 9i and CSALTER in 10g and up.
      C2) Using Export/Import (or Datapump in 10g and up).
      C3) Using a combination of ALTER DATABASE CHARACTER SET (8i, 9i) / CSALTER (10g and up) and export/import
    D) Further reading
  References

Applies to:
Oracle Server - Enterprise Edition
Information in this document applies to any platform.

Purpose
This article gives an overview of the methods to change the database character set. The current NLS_CHARACTERSET can be seen in NLS_DATABASE_PARAMETERS:
select value from NLS_DATABASE_PARAMETERS where parameter='NLS_CHARACTERSET';

Use this note to get a basic understanding of the methods, then use the in-depth notes at the end of this note for specific guidance on your conversion.

For all questions regarding the NLS_NCHAR_CHARACTERSET, first read:
Note 276914.1 The National Character Set in Oracle 9i and 10g

Scope and Application


Anyone trying to change the NLS_CHARACTERSET.

There are still "DBAs" out there who try to change the NLS_CHARACTERSET or NLS_NCHAR_CHARACTERSET by updating props$. This is NOT supported and WILL corrupt your database. It is one of the best ways to destroy a complete dataset. Oracle Support will TRY to help you out of this, but Oracle will NOT warrant that the data can be recovered or that recovered data is correct. You WILL be asked to do a FULL export and a complete rebuild of the database. Please, do NOT update props$.

Changing the Database Character Set ( NLS_CHARACTERSET )

A) The database character set (NLS_CHARACTERSET)
The NLS_CHARACTERSET of an Oracle database defines which characters can be stored in the database using the CHAR, VARCHAR2, LONG and CLOB datatypes. A characterset does not define languages; it defines a certain range of characters. Any language that uses only the characters known by that characterset can then be stored.

28/10/2009 18:58

If you change character sets there is a possibility that characters that you currently use are not defined in the new character set, and therefore you could 'corrupt' your data. You should always check this by using the Character Set Scanner (Csscan) before making any changes to your character set.
Note 458122.1 Installing and Configuring Csscan in 8i and 9i (Database Character Set Scanner)
Note 745809.1 Installing and configuring Csscan in 10g and 11g (Database Character Set Scanner)
Note 444701.1 Csscan output explained

B) Choosing a new database character set


For the majority of customers a Unicode character set (AL32UTF8) is the best choice:
Note 333489.1 Choosing a database character set means choosing Unicode

If you choose any other character set, please be advised of the following:
Note 306411.1 Character Set Consolidation for Oracle Database 11gR1

For non-Unicode charactersets the best choice is one of the xx8MSWIN125x charactersets, even if the database itself runs on a Unix platform. The reason is simply that the majority of clients are Windows based systems, hence the best non-Unicode characterset for a database is one that can store all the characters known by those clients, which means an xx8MSWIN125x characterset. A detailed discussion is found in:
Note 264294.1 Choosing from WE8ISO8859P1, WE8ISO8859P15 or WE8MSWIN1252 as db character set

To find out which languages can be stored in the most common charactersets, see:
Note 62421.1 Which Character Set Supports Which Language

To find out which characters are known in a certain characterset, see:
Note 282336.1 Charts of most current mono-byte Character sets
or use Locale Builder to open the Oracle characterset definition:
Note 223706.1 Using Locale Builder to view the definition of character sets

An excellent external resource is http://www.eki.ee/letter/ . The website allows you to choose a language and then gives an overview of all charactersets that contain all the letters needed for this language. Please note that Oracle does not warrant that the information on this website is accurate.
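As a rough illustration of why an xx8MSWIN125x characterset is usually a better non-Unicode choice than its ISO counterpart, the sketch below checks which candidate charactersets define every character of a sample text. It uses Python codecs as stand-ins for Oracle charactersets (latin-1 for WE8ISO8859P1, iso8859-15 for WE8ISO8859P15, cp1252 for WE8MSWIN1252, utf-8 for AL32UTF8). That mapping is an assumption made for illustration only: Python's codec tables are not Oracle's characterset definitions, and the function name is hypothetical.

```python
# Python codec names used as stand-ins for Oracle charactersets (assumption,
# not Oracle's own definitions).
CANDIDATES = {
    "WE8ISO8859P1": "latin-1",
    "WE8ISO8859P15": "iso8859-15",
    "WE8MSWIN1252": "cp1252",
    "AL32UTF8": "utf-8",
}


def charsets_that_can_store(text):
    """Return the candidate charactersets that define every character in text."""
    usable = []
    for oracle_name, codec in CANDIDATES.items():
        try:
            text.encode(codec)
        except UnicodeEncodeError:
            continue  # at least one character is not defined in this set
        usable.append(oracle_name)
    return usable


# The euro sign and typographic quotes are typical input from Windows clients.
print(charsets_that_can_store("Price: 10 €"))
print(charsets_that_can_store("“quoted”"))
```

With these stand-ins, the euro sign rules out the latin-1 codec and the typographic quotes rule out both ISO codecs, while cp1252 and utf-8 accept both samples.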

C) Changing the database character set


There are 2 basic ways of changing the character set and a third 'combined' way:

C1) Using the "ALTER DATABASE CHARACTER SET" command in 8i or 9i and CSALTER in 10g and up.
This is not always possible, because ALTER DATABASE CHARACTER SET / CSALTER does not (!) change the actual code points of the stored data. So this method can only be used if the data currently stored in the database is a binary subset of the new character set (= all codes (!) of the old characterset are valid and mean the same character in the new characterset; the new characterset is then a strict superset of the old).

For 8i / 9i this is documented in:
Note 66320.1 8i/9i only: Changing the Database Character Set or the Database National Character Set in 8i/9i
It can be used for these combinations:
Note 119164.1 Changing Database Character Set - Valid Superset Definitions
Note that while it is technically not NEEDED to run Csscan in 8i and 9i, we strongly recommend running Csscan.

In 10g and up the "ALTER DATABASE CHARACTER SET" command is NOT to be used anymore; Csscan/Csalter is the new way to change a database characterset. In 10g you first need to run Csscan and then check the Csscan results to see whether you can run Csalter. Csalter does not depend on the superset definitions; it depends on the Csscan output.

More information about Csscan and Csalter is in:
Note 745809.1 Installing and configuring Csscan in 10g and 11g (Database Character Set Scanner)
Note 444701.1 Csscan output explained
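The binary subset requirement of method C1 (every code of the old characterset is valid and means the same character in the new one) can be sketched for single-byte charactersets as follows. Python codecs again stand in for Oracle charactersets, which is an assumption: the result for a real database is determined by the superset definitions in Note 119164.1 and by the Csscan output, not by this code. The function names are hypothetical.

```python
def codes_that_change(old_codec, new_codec):
    """List single-byte codes defined in old_codec whose meaning differs in new_codec."""
    changed = []
    for code in range(256):
        raw = bytes([code])
        try:
            old_char = raw.decode(old_codec)
        except UnicodeDecodeError:
            continue  # code is not defined in the old set, nothing to preserve
        try:
            new_char = raw.decode(new_codec)
        except UnicodeDecodeError:
            new_char = None  # code is not valid on its own in the new set
        if new_char != old_char:
            changed.append(code)
    return changed


def is_binary_subset(old_codec, new_codec):
    """True if every code of old_codec keeps its meaning in new_codec."""
    return not codes_that_change(old_codec, new_codec)


print(is_binary_subset("ascii", "cp1252"))
print(is_binary_subset("ascii", "utf-8"))
print(is_binary_subset("iso8859-15", "cp1252"))
```

In this sketch the 7-bit ASCII codec is a binary subset of both cp1252 and utf-8, while iso8859-15 is not a binary subset of cp1252: code 0xA4, for example, is the euro sign in iso8859-15 but the generic currency sign in cp1252. Leaving the stored codes untouched in such a case would silently change what the data means.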

C2) Using Export/Import (or Datapump in 10g and up).


This will always work: you simply export the current database, then create a new database with the new character set and import the data into that database. Of course the characters that you were storing still have to be defined in the new character set for this to work! See:
Note 227332.1 NLS considerations in Import/Export - Frequently Asked Questions

Even when using full exp/imp we advise always running Csscan up front to detect any possible problems. You can use this method for any change where all characters from the old characterset are known in the new one (but they may use different codes for the same character).

When using Datapump there is a chance of data corruption when going from an 8 bit characterset to UTF8 / AL32UTF8 or another multibyte characterset on ALL 10g versions (including 10.1.0.5 and 10.2.0.3) and 11.1.0.6. Impdp may provoke data corruption unless you have applied Patch 5874989. The "old" exp/imp work fine. This problem is fixed in the 10.2.0.4 and 11.1.0.7 patchsets. All existing patches for this bug are found here:
http://updates.oracle.com/download/5874989.html

For Windows the fix is included in:
10.1.0.5.0 Patch 20 (10.1.0.5.20P) or later ( Note 276548.1 )
10.2.0.3.0 Patch 11 (10.2.0.3.11P) or later ( Note 342443.1 )

The patch is technically only needed on the impdp side, but if you use expdp/impdp between different character sets we suggest patching all your systems.
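The remark that the same character may use different codes in the old and the new characterset is what distinguishes export/import from method C1: the import converts the codes. A minimal sketch of that conversion, using Python codecs as stand-ins for Oracle charactersets (an assumption for illustration; this is not how exp/imp or Datapump are implemented, and the function name is hypothetical):

```python
def convert_codes(stored, old_codec, new_codec):
    """Re-encode bytes stored in old_codec so they mean the same characters in new_codec.

    Raises UnicodeEncodeError if a character is not defined in new_codec.
    """
    return stored.decode(old_codec).encode(new_codec)


euro_old = "€".encode("iso8859-15")

print(euro_old.hex())                                          # code in the old set
print(convert_codes(euro_old, "iso8859-15", "cp1252").hex())   # same character, new code
print(convert_codes(euro_old, "iso8859-15", "utf-8").hex())    # same character, multibyte code
```

Here the euro sign is stored as a4 in iso8859-15, becomes 80 in cp1252 and the three bytes e282ac in utf-8. Converting it to latin-1 raises an error instead, because that codec does not define the euro sign; this corresponds to the requirement that all characters must be defined in the new character set.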

C3) Using a combination of ALTER DATABASE CHARACTER SET (8i, 9i) / CSALTER (10g and up) and export/import
In some cases method 1 does not work because Csscan tells you that some data needs to be converted to the new character set (= "Convertible" data), and method 2 will simply take too much time. In those cases it is usually possible to use a combination of the 2 methods:

a) Export only the "Convertible" data from the tables that are listed by Csscan (= data for which the CODE changes between the current and the new characterset).
b) Truncate or drop those tables.
c) Run Csscan again to confirm that all data is now ready to be moved to the new character set directly and, if that is the case, change the character set of the database using the ALTER DATABASE CHARACTER SET (8i, 9i) / CSALTER (10g and up) command (method C1).
d) Now that the character set has changed, simply import the data exported in step (a). The import will convert that data so that it gets stored using the correct character codes for this character set.

These specific notes can guide you through some often used conversions; they show how to use the above mentioned "combined method" in practice and document extra checks:

Changing from US7ASCII to WE8MSWIN1252 or other xxISO8859Pxx to xx8MSWIN12xx charactersets
Note 555823.1 Changing US7ASCII or WE8ISO8859P1 to WE8MSWIN1252
Note 263119.1 Changing EE8ISO8859P2 to EE8MSWIN1250
Note 260022.1 Changing AR8ISO8859P6 to AR8MSWIN1256
Note 261871.1 Changing EL8ISO8859P7 to EL8MSWIN1253
Note 266309.1 Changing WE8ISO8859P9 to WE8ISO8859P1/WE8MSWIN1252
Note 246008.1 Changing WE8ISO8859P15 to WE8MSWIN1252

Other combinations
Note 265859.1 Changing WE8DEC to WE8ISO8859P1/WE8MSWIN1252
Note 257722.1 Changing WE8ISO8859P1 to WE8ISO8859P15
Note 261639.1 Changing WE8MSWIN1252 to WE8ISO8859P15
Note 273281.1 Changing WE8ISO8859P15 TO WE8ISO8859P1

Changing the NLS_CHARACTERSET to Unicode (= AL32UTF8 or UTF8)
Note 788156.1 AL32UTF8 / UTF8 (Unicode) Database Character Set Implications
Note 260192.1 Changing the NLS_CHARACTERSET to AL32UTF8 / UTF8 (Unicode)

The above 2 notes for going to AL32UTF8 can also be used for going to any other varying width characterset like ZHS16GBK, ZHT16MSWIN950, ZHT16HKSCS, ZHT16HKSCS31, KO16MSWIN949, JA16SJIS ... We strongly suggest, however, using AL32UTF8 as NLS_CHARACTERSET; there is no added value in using one of the other varying width charactersets as NLS_CHARACTERSET. Basically AL32UTF8 is "the way forward": AL32UTF8 supports ALL characters defined in any of the other charactersets.

Changing the NLS_CHARACTERSET from AL32UTF8 to UTF8 (or from UTF8 to AL32UTF8) is also simply a matter of following Note 260192.1 Changing the NLS_CHARACTERSET to AL32UTF8 / UTF8 (Unicode).

This is only needed for Oracle RDBMS Version 7 systems:
Note 234381.1 Changing NLS_CHARACTERSET from AL24UTFFSS to UTF8 - AL32UTF8
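To make the "Convertible" data of the combined method (C3) more concrete, the sketch below sorts sample values into three groups, depending on what happens to their codes when moving from one characterset to another. The labels are borrowed from the Csscan report categories; the code itself is only an illustration using Python codecs as stand-ins for Oracle charactersets and is not how Csscan works internally. It also ignores Csscan's check for data that no longer fits its column after conversion. The function name is hypothetical.

```python
def classify(value, old_codec, new_codec):
    """Classify a value by what a characterset change would do to its stored codes."""
    old_bytes = value.encode(old_codec)
    try:
        new_bytes = value.encode(new_codec)
    except UnicodeEncodeError:
        return "lossy"        # a character is not defined in the new set
    if new_bytes == old_bytes:
        return "changeless"   # codes stay the same, method C1 is enough
    return "convertible"      # codes change, export/import is needed


# cp1252 stands in for the current characterset, utf-8 for the new one.
for value in ["invoice", "café", "10 €"]:
    print(value, "->", classify(value, "cp1252", "utf-8"))
```

In this sketch only the plain ASCII value is changeless; the two values containing non-ASCII characters are convertible and would belong to the data exported in step (a). A value becomes lossy when the target does not define one of its characters, for example the euro sign when going from cp1252 to latin-1.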

D) Further reading
There are some additional considerations when you change the character set of an Oracle Applications database; please see the following note for a complete overview of those:
Note 124721.1 Migrating an Applications Installation to a New Character Set

Note that display problems are most likely *NOT* resolved by changing the database characterset. Instead, first check whether you can store/retrieve the data using SQLdeveloper; this is a "known good client" that needs no NLS configuration. You can download it from http://www.oracle.com/technology/products/database/sql_developer/

If the data is displayed correctly in SQLdeveloper then you are sure it is correct in the database and that the current NLS_CHARACTERSET supports the character. If this is the case then see the following notes to correctly configure your other client(s); you can then use the data entered through SQLdeveloper as "reference":
Note 158577.1 NLS_LANG Explained (How does Client-Server Character Conversion Work?)
Note 179133.1 The correct NLS_LANG in a Windows Environment
Note 264157.1 The correct NLS_LANG on Unix Environments
Note 229786.1 NLS_LANG and webservers explained.

A more detailed debugging guide is Note 788931.1 Troubleshooting RDBMS (client and server) NLS Problems (Charactersets, sorts, dates, ..)

References
Note 124721.1 - Migrating an Applications Installation to a New Character Set
Note 444701.1 - Csscan output explained
Note 458122.1 - Installing and Configuring Csscan in 8i and 9i (Database Character Set Scanner)
Note 60134.1 - Globalization (NLS) - Frequently Asked Questions
Note 745809.1 - Installing and configuring Csscan in 10g and 11g (Database Character Set Scanner)

Keywords

AL32UTF8 ; CHARACTERSET ; UNICODE ; NLS_CHARACTERSET ; CSSCAN ; CHANGE ;
