Netezza SQL Toolkit

NPS SQL Extensions Toolkit Users Guide
Document Number: D20484 Rev. 1 Software Release: 4.5.2 Revised: January 30, 2009
Netezza Corporation Corporate Headquarters 26 Forest St., Marlborough, Massachusetts 01752 tel 508.382.8200 fax 508.382.8300 www.netezza.com
The specifications and information regarding the products described in this manual are subject to change without notice. All statements, information, and recommendations in this manual are believed to be accurate. Netezza makes no representations or warranties of any kind, express or implied, including, without limitation, those of merchantability, fitness for a particular purpose, and noninfringement, regarding this manual or the products' use or performance. In no event will Netezza be liable for indirect, incidental, consequential, special, or economic damages (including lost business profits, business interruption, loss or damage of data, and the like) arising out of the use or inability to use this manual or the products, regardless of the form of action, whether in contract, tort (including negligence), breach of warranty, or otherwise, even if Netezza has been advised of the possibility of such damages. Copyright 2005-2009 Intelligent Integration Systems, Inc. Portions of this publication were derived from PostgreSQL documentation. For those portions of the documentation that were derived originally from PostgreSQL documentation, and only for those portions, the following applies: PostgreSQL is copyright 1996-2001 by the PostgreSQL global development group and is distributed under the terms of the license of the university of california below. Postgres95 is copyright 1994-5 by the Regents of the University of California. Permission to use, copy, modify, and distribute this documentation for any purpose, without fee, and without a written agreement is hereby granted, provided that the above copyright notice and this paragraph and the following two paragraphs appear in all copies. 
In no event shall the University of California be liable to any party for direct, indirect, special, incidental, or consequential damages, including lost profits, arising out of the use of this documentation, even if the University of California has been advised of the possibility of such damage. The University of California specifically disclaims any warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose. The documentation provided hereunder is on an "as-is" basis, and the University of California has no obligations to provide maintenance, support, updates, enhancements, or modifications. Netezza, the Netezza logo, NPS, Snippet, Snippet Processing Unit, SPU, Snippet Processing Array, SPA, Performance Server, Netezza Performance Server, Asymmetric Massively Parallel Processing, AMPP, Intelligent Query Streaming, SQL-Blast and other marks are trademarks or registered trademarks of Netezza Corporation in the United States and/or other countries. All rights reserved. The Netezza implementation of the ODBC driver is an adaptation of an open source driver, Copyright 2000, 2001, Great Bridge LLC. The source code for this driver and the object code of any Netezza software that links with it are available upon request to source-request@netezza.com. Red Hat is a trademark or registered trademark of Red Hat, Inc. in the United States and/or other countries. Linux is a trademark or registered trademark of Linus Torvalds in the United States and/or other countries. D-CC, D-C++, Diab+, FastJ, pSOS+, SingleStep, Tornado, VxWorks, Wind River, and the Wind River logo are trademarks, registered trademarks, or service marks of Wind River Systems, Inc. Tornado patent pending. APC and the APC logo are trademarks or registered trademarks of American Power Conversion Corporation. All document files and software of the above named third-party suppliers are provided "as is" and may contain deficiencies. 
Netezza and its suppliers disclaim all warranties of any kind, express or implied, including, without limitation, those of merchantability, fitness for a particular purpose, and noninfringement. In no event will Netezza or its suppliers be liable for indirect, incidental, consequential, special, or economic damages (including lost business profits, business interruption, loss or damage of data, and the like), or the use or inability to use the above-named third-party products, even if Netezza or its suppliers have been advised of the possibility of such damages. All other trademarks mentioned in this document are the property of their respective owners. Document Number: 20484 Software Release Number: 4.5.2 NPS SQL Extensions Toolkit Users Guide Copyright 2009 Netezza Corporation. All rights reserved. Regulatory Notices Install the NPS 8000 Series in a restricted-access location. Ensure that only those trained to operate or service the equipment have physical access to it. Install each AC power outlet near the NPS rack that plugs into it, and keep it freely accessible. You must provide all disconnect devices and over-current protection devices. Product may be powered by redundant power sources. Disconnect ALL power sources before servicing. FCC Statement This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to part 15 of the FCC rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio-frequency energy and, if not installed and used in accordance with the instruction manual, may cause harmful interference to radio communications. Operation of this equipment in a residential area is likely to cause harmful interference, in which case users will be required to correct the interference at their own expense. 
CSA Statement
This Class A digital apparatus meets all requirements of the Canadian Interference-Causing Equipment Regulations (ICES-003). Cet appareil numérique de la classe A est conforme à la norme NMB-003 du Canada.
CE Statement (Europe)
This product complies with the European Low Voltage Directive 73/23/EEC and EMC Directive 89/336/EEC as amended by European Directive 93/68/EEC.
Warning: This is a class A product. In a domestic environment this product may cause radio interference in which case the user may be required to take adequate measures.
Contents
Preface
1 Installation and Setup
    Licensing Information
    NPS Administration Information
        NPS System Prerequisites
        Installing the Netezza SQL Extensions Toolkit
        Enabling SQL Functions Support in a Database
        User Account Permissions and Requirements
        Displaying the SQL Extensions Toolkit Version
        Upgrading the SQL Extensions Toolkit
        Disabling the SQL Extensions Toolkit in a Database
        Removing the SQL Extensions Toolkit
        Using Different Versions of the SQL Extensions Toolkit
        Best Practices for Upgrading NPS Systems with the SQL Extensions Toolkit
        Best Practices for Backups and Restores of the NPS Data
    Known Issues
2 XML Data
    User Type XML
    Referencing Columns
    Getting Started: Publishing SQL Data as XML
    Using XPath Expressions
    XML Function Reference
        IsValidXML
        IsXML
        XMLAGG
        XMLAttributes
        XMLConcat
        XMLElement
        XMLExistsNode
        XMLExtract
        XMLExtractValue
        XMLParse
        XMLRoot
        XMLSerialize
        XMLUpdate
3 Data Transformation
    Data Transformation Function Reference
        compress
        decompress
        encrypt/decrypt
        uuencode
        uudecode
4 Hashing
    Hash Function Reference
        hash
        hash4
        hash8
5 Date and Time Comparisons
    Date and Time Function Reference
        day
        days_between
        hour
        hours_between
        minute
        minutes_between
        month
        next_month
        next_quarter
        next_year
        second
        seconds_between
        this_month
        this_quarter
        this_week
        this_year
        weeks_between
        year
6 Text Analytics
    Word Comparison Function Reference
        word_diff
        word_find
        word_key
        word_key_tochar
        word_keys_diff
        word_stem
    Regular Expression Function Reference
        The Flags Argument
        regexp_extract
        regexp_extract_all
        regexp_extract_all_sp
        regexp_extract_sp
        regexp_instr
        regexp_like
        regexp_match_count
        regexp_replace
        regexp_replace_sp
7 Text Utility
    Text Utility Function Reference
        hextoraw
        rawtohex
        replace
        strleft
        strright
8 Array
    Array Function Reference
        add_element
        array
        array_combine
        array_concat
        array_count
        array_split
        array_type
        delete_element
        element_name
        get_value_type
        replace_element
9 Collection
    User Type Collection
    Collection Function Reference
        collection
        element_type
10 Miscellaneous
    Miscellaneous Function Reference
        greatest
        least
        mt_random
        corr
        covar_pop
        covar_samp
Index
List of Tables
Table 1-1: Known Issues
Table 3-1: Uuencoding, Part I
Table 3-2: Uuencoding, Part II
Table 4-1: Algorithms Supported for Cryptographic Hashing
Table 6-1: Algorithms Supported for Phonetic Encoding
Table 6-2: Flags used in Regular Expressions Functions
Table 8-1: Array Types
Preface
This document describes the SQL Extensions Toolkit for the Netezza platform. The Netezza SQL Extensions Toolkit was developed by Intelligent Integration Systems, Inc., an NDN innovator.
Audience
This guide is intended for users who require the additional capabilities provided by the SQL Extensions functions, which enable users to manipulate SQL data in more sophisticated ways. Users should be familiar with the basic operation and concepts of the NPS system. Users should also be familiar with C-style function declarations, because the API defined in this document uses C-style declarations rather than SQL-style declarations.
About This Guide

The guide contains the following chapters.
Topics: System prerequisites, installation, version information, upgrading, disabling, and removing the toolkit, using different toolkit versions, backups, and restores.
See: Chapter 1, Installation and Setup

Topics: Importing and storing XML data in a SQL database, manipulating XML within the database, and publishing both XML and conventional SQL data in XML form.
See: Chapter 2, XML Data

Topics: Transforming data by compressing, encrypting, or uuencoding, and restoring to the original form using decompress, decrypt, and uudecode.
See: Chapter 3, Data Transformation

Topics: Using hash functions for cryptography, checksums, and lookups.
See: Chapter 4, Hashing

Topics: Using date and time functions to compare values of type date or of type timestamp.
See: Chapter 5, Date and Time Comparisons

Topics: Performing fuzzy comparisons (approximately matching a search key) and using regular expressions to match precise patterns of characters.
See: Chapter 6, Text Analytics

Topics: Converting between ASCII hexadecimal and ASCII, substituting strings, and extracting strings.
See: Chapter 7, Text Utility

Topics: Creating, combining and splitting arrays, and retrieving, deleting, replacing and counting array elements.
See: Chapter 8, Array

Topics: Grouping heterogeneous pieces of data, i.e. data of different types.
See: Chapter 9, Collection

Topics: Determining the greatest/least value, correlation coefficient, covariance, and generating random numbers.
See: Chapter 10, Miscellaneous
Symbols and Conventions

This guide uses the following typographical conventions:
- Numbered steps for procedures
- Bulleted lists for topics
- Italics for terms, and user-defined variables such as file names
- Bold for command line input and system output examples
If You Need Help

If you are having trouble using the Netezza Performance Server, you should:
1. Retry the action, carefully following the instructions given for that task in the documentation.
2. Go to the Netezza Support Web page at https://support.netezza.com. Select Login to Customer Support Center and enter your support username and password. Click the Knowledge tab to search the knowledgebase solutions, or click the Service Desk tab to submit a support request.
3. If you are unable to access the Support Web site, you can also contact Netezza Support at the following telephone numbers:
   North American Toll-Free: +1.877.810.4441
   United Kingdom Free-Phone: +0.800.032.8382
   International Direct: +1.508.620.2281
For a description of the Netezza Support plans, refer to http://www.netezza.com/support/offerings.cfm. Refer to your Netezza maintenance agreement for details about your support plan choices and coverage.
Netezza Welcomes Your Comments

Let us know what you like or dislike about our manuals. To help us with future versions of our manuals, we want to know about any corrections or clarifications that you would find useful. Include the following information:
- The name and version of the manual that you are using
- Any comments that you have about the manual
- Your name, address, and phone number
Send us an e-mail message at the following address: doc@netezza.com
The doc alias is reserved exclusively for reporting errors and omissions in our documentation. We appreciate your suggestions.
CHAPTER 1
Installation and Setup
What's in this chapter
- Licensing Information
- NPS Administration Information
- Known Issues
The Netezza SQL Extensions Toolkit is an optional package for Netezza Performance Server (NPS) systems. This toolkit was developed by NDN innovator, Intelligent Integration Systems, Inc. This chapter provides information on installing and configuring the Netezza SQL Extensions Toolkit on an NPS system, as well as special information for managing backups and upgrades.
Licensing Information
Netezza customers can obtain the toolkit from the Netezza FTP server in the Releases area. The software kit is contained in two files, libnetcrypto-version.tar.gz and libnetxml-version.tar.gz, where version indicates the currently released version of the software kit. The software kit contains a readme file, libraries, the object files for the functions, and scripts which ease the process of defining and using the toolkit functions in an NPS database, as well as disabling and removing the functions.
NPS Administration Information

This section describes the system prerequisites and administration information for the Netezza SQL Extensions Toolkit.
NPS System Prerequisites

The Netezza SQL Extensions Toolkit is designed for use on NPS systems that run NPS Release 4.5.2 or later.
Installing the Netezza SQL Extensions Toolkit

To install the Netezza SQL Extensions Toolkit, do the following:
1. Log in to the NPS system as the root user.
2. Copy the sqlext.package.tar.z file to a directory on the NPS system such as /home/nz or another location. (You obtain the package from the Netezza FTP site.)
3. Untar the package using the following command:
   tar -xzvpf sqlext.package.tar.z
   The command extracts two files, libnetcrypto-version.tar.gz and libnetxml-version.tar.gz.
4. Extract the software files and compiled objects in the libnetcrypto-version.tar.gz file:
   tar -xzf libnetcrypto-version.tar.gz
   The tar command uncompresses and untars the contents to a directory named libnetcrypto/version in the current directory, where version is the version number of the SQL Extensions Toolkit.
5. Extract the software files and compiled objects in the libnetxml-version.tar.gz file:
   tar -xzf libnetxml-version.tar.gz
   The tar command uncompresses and untars the contents to a directory named libnetxml/version in the current directory, where version is the version number of the SQL Extensions Toolkit.
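The extraction steps above can be wrapped in a small shell function so that a missing or incomplete package is reported before anything is registered. This is a minimal sketch, not part of the toolkit: the function name extract_toolkit and the target-directory argument are assumptions for illustration, and the inner archives are located by file-name pattern because the version string differs between releases.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the toolkit): unpack the outer package
# and then both inner archives into one target directory.
extract_toolkit() {
    pkg=$1      # path to sqlext.package.tar.z
    dest=$2     # directory to extract into, for example /home/nz

    [ -f "$pkg" ] || { echo "package not found: $pkg" >&2; return 1; }
    mkdir -p "$dest" || return 1

    # Outer archive: yields libnetcrypto-<version>.tar.gz and libnetxml-<version>.tar.gz
    tar -xzpf "$pkg" -C "$dest" || return 1

    # Inner archives: each unpacks to <library>/<version> under the target directory
    for inner in "$dest"/libnetcrypto-*.tar.gz "$dest"/libnetxml-*.tar.gz; do
        [ -f "$inner" ] || { echo "missing inner archive: $inner" >&2; return 1; }
        tar -xzf "$inner" -C "$dest" || return 1
    done
}
```

A call such as extract_toolkit ./sqlext.package.tar.z /home/nz would leave libnetcrypto/version and libnetxml/version under /home/nz, matching the layout the later procedures expect.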
Enabling SQL Functions Support in a Database

After you untar the SQL Extensions Toolkit files, you can enable SQL Extensions query support by registering the SQL Extensions functions and API. To enable SQL Extensions queries, do the following:
1. Log in to the NPS system as the nz user account.
2. Change to the directory where the first part of the SQL Extensions library files resides, where dir is the directory in which you untarred the files:
   cd <dir>/libnetcrypto/version
3. Run the following command, specifying the database in which you want to define the SQL Extensions functions and the NPS user account (with its password) that will own the functions:
   ./install -d <dbname> -u <username> -W <password>
   The command could take up to one minute to run. Upon completion, the command displays the message Successfully Installed Crypto Library to <dbname>.
   Note: If your database name uses spaces or mixed-case letters such as myDatabase, make sure that you specify double-quotation marks around the database name and escape the quotes. For example:
   ./install -d \"myDatabase\" -u user -W password
4. Change to the directory where the second part of the SQL Extensions library files resides, where dir is the directory in which you untarred the files:
   cd <dir>/libnetxml/version
5. Run the following command, specifying the database in which you want to define the SQL Extensions functions and the NPS user account (with its password) that will own the functions:
   ./install -d <dbname> -u <username> -W <password>
   The command could take up to one minute to run. Upon completion, the command displays the message Successfully Installed XML Library to <dbname>.
These commands define the SQL Extensions functions and register them in the specified database. The NPS user account you specify becomes the owner of the functions. After this procedure, NPS administrators can manage the SQL Extensions functions as objects in the NPS database, and users who have permission to use the SQL Extensions functions can include them in queries. Figure 1-1 shows a sample NzAdmin window for an NPS system that has the SQL Extensions Toolkit.
Figure 1-1: NzAdmin Interface with the Netezza SQL Extensions Toolkit Functions
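The quoting rule for database names with spaces or mixed-case letters is easy to get wrong on the command line. The following sketch prints the two install commands for review rather than running them. The helper names quote_dbname and print_enable_cmds are hypothetical, the password is left as a placeholder so that it never appears in output, and the single-quoted form '"myDatabase"' is assumed to be equivalent to the escaped form \"myDatabase\" because both pass literal double quotes to the install script.

```shell
#!/bin/sh
# Hypothetical helper: wrap a database name in literal double quotes when it
# contains uppercase letters or whitespace; otherwise return it unchanged.
quote_dbname() {
    case "$1" in
        *[[:upper:][:space:]]*) printf '"%s"\n' "$1" ;;
        *)                      printf '%s\n' "$1" ;;
    esac
}

# Hypothetical helper: print (do not run) the install command for each library.
# $1 = directory where the files were untarred, $2 = toolkit version,
# $3 = database name, $4 = NPS user that will own the functions
print_enable_cmds() {
    db=$(quote_dbname "$3")
    for lib in libnetcrypto libnetxml; do
        printf "cd %s/%s/%s && ./install -d '%s' -u %s -W <password>\n" \
            "$1" "$lib" "$2" "$db" "$4"
    done
}
```

Printing the commands first keeps the Crypto-then-XML order of the procedure visible and lets an administrator confirm the quoting before substituting the real password.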
User Account Permissions and Requirements

To run a SQL Extensions query, NPS user accounts must have the execute permission for function and aggregate objects, as well as for the toolkit functions and aggregates that are added to the system. Users who need to modify the functions (such as to replace the object files with new object files) must also have create and alter permission for the function and aggregate objects.
Displaying the SQL Extensions Toolkit Version

To display the version of the XML functions available in the SQL Extensions toolkit, use the following SQL command:
SELECT regexp_Version();
Sample output follows:

REGEXP_VERSION
------------------------------------------------
IISI XML/Regular Expression Library Version 1.2 Build ()
(1 row)
To display the version of the rest of the functions available in the SQL Extensions toolkit, use the following SQL command:
SELECT CRYPTO_VERSION();
Sample output follows:

CRYPTO_VERSION
------------------------------------------------
IISI CRYPTO Library Version 1.2 Build ()
(1 row)
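When several databases have the toolkit enabled, it can help to reduce each version string to its number before comparing them. This is a minimal sketch that assumes the output keeps the "Library Version <number>" wording shown in the samples; parse_toolkit_version is a hypothetical helper, not a toolkit command, and how the query output is captured is left to the reader.

```shell
#!/bin/sh
# Hypothetical helper: extract the version number from a string such as
# "IISI CRYPTO Library Version 1.2 Build ()". Prints nothing if the
# expected wording is not found.
parse_toolkit_version() {
    printf '%s\n' "$1" | sed -n 's/.*Library Version \([0-9][0-9.]*\).*/\1/p'
}
```

An empty result signals that the string did not match, which is safer than guessing a version from unexpected output.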
Upgrading the SQL Extensions Toolkit

Update kits or upgrades of the SQL Extensions toolkit may be made available with fixes or enhancements to the functionality. When such kits become available, they will contain instructions for updating or upgrading to the latest software API.
Disabling the SQL Extensions Toolkit in a Database

You can disable the SQL Extensions functions either temporarily (during testing or troubleshooting) or permanently (such as prior to removing the package). To disable support for SQL Extensions queries in a particular database, follow these steps:
1. Log in to the NPS system as the nz user account.
2. Change to the installation location of the XML functions in the toolkit, for example:
   cd <install-dir>/libnetxml/version
3. Run the following command and specify the database name, NPS user name, and password for your system:
   ./install -R -d <dbname> -u <username> -W <password>
   The command displays the message Successfully Uninstalled XML Library from <dbname> when it completes.
4. Change to the installation location of the rest of the functions in the toolkit, for example:
   cd <install-dir>/libnetcrypto/version
5. Run the following command and specify the database name, NPS user name, and password for your system:
   ./install -R -d <dbname> -u <username> -W <password>
   The command displays the message Successfully Uninstalled Crypto Library from <dbname> when it completes.
6. Repeat Steps 2-5 for each database in which you want to disable the SQL Extensions query support.
This install command uses the DROP FUNCTION|AGGREGATE commands to drop the SQL Extensions functions that were added by the install script.
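Because Step 6 repeats the procedure for every database, a short loop can generate the full list of commands ahead of time. The sketch below only prints the commands, in the XML-then-Crypto order of Steps 2-5. The function name print_disable_cmds is hypothetical, the password stays a placeholder, and database names are assumed to be simple lowercase names that need no quoting.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the disable commands for each database.
# $1 = install directory, $2 = toolkit version, $3 = NPS user,
# remaining arguments = database names
print_disable_cmds() {
    dir=$1
    ver=$2
    user=$3
    shift 3
    for db in "$@"; do
        # XML library first, then the Crypto library, as in Steps 2-5
        for lib in libnetxml libnetcrypto; do
            printf "(cd %s/%s/%s && ./install -R -d %s -u %s -W <password>)\n" \
                "$dir" "$lib" "$ver" "$db" "$user"
        done
    done
}
```

Each printed line runs in a subshell, so the working directory of the calling session is left unchanged when the lines are later executed.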
Removing the SQL Extensions Toolkit

To remove or uninstall the SQL Extensions toolkit from an NPS system, first follow the procedure in the previous section to disable SQL Extensions support in each database where it is currently enabled. After you disable support for the SQL Extensions functions, you can remove all of the files in the libnetcrypto/version directory and the libnetxml/version directory.
Using Different Versions of the SQL Extensions Toolkit

Since you install the toolkit to a specific database on the NPS system, it is possible to unpack a new or different version of the kit, install it in a different database, and thus use different versions of the API simultaneously on the NPS system. However, this is not a recommended practice for long-term use. If you install a newer version of the toolkit to a different database, such as a test database for testing and comparison purposes, you should eventually update your production databases with the latest toolkit.
Best Practices for Upgrading NPS Systems with the SQL Extensions Toolkit
After you install the Netezza SQL Extensions Toolkit, take special precautions before you patch or upgrade the NPS software on your system. While most patch and service pack updates should not affect the operation of the toolkit functions, it is possible that an upgrade could stop the functions from working. For example, an upgrade from one major release to another could require you to obtain a new toolkit installation package with new function object files. Before you upgrade the NPS software on your system, make sure that you consult with Netezza Support to ensure that the planned upgrade will not affect your toolkit functions. The NPS Release Notes or the service pack readme file identifies any known situations where an update or upgrade can impact the functions.
Best Practices for Backups and Restores of the NPS Data

As a best practice, keep a backup copy of the toolkit installation files in a safe location outside of the NPS system. Make sure that you have recent backups of your NPS systems in the event that you need to recover from an accidental change to your data, or to restore NPS services as part of a disaster recovery situation.

There are no special requirements or procedures needed to back up the SQL Extensions functions. After you register the toolkit functions on an NPS system, they and their associated object source files are backed up during the normal Netezza nzbackup operations. If you alter a function or an aggregate (perhaps as a result of a new object file with fixes), the next incremental backup also captures the new object files.

For a schema-only restore, you can use the nzrestore -allincs argument, which restores the object files from all available backup increments so that any referenced functions will be created and executable following the restore. If you attempt a -schema-only restore on an increment which does not have function object files (because they have not been altered during this time), the restore process creates zero-length placeholder object files for those functions and logs the signatures of the incomplete functions in the restoresvr log file. The resulting functions are defined in the database, but they cannot be executed because their object files have not been restored. You must use CREATE OR REPLACE commands to update the functions or aggregates with their necessary object files.
Known Issues
This release of the Netezza SQL Extensions Toolkit has the following known issues:
Table 1-1: Known Issues
Reference    Issue Description
44849
XMLAgg() can only aggregate VARCHAR columns, not CHAR columns. For example, if emp.name is defined as CHAR(12), the following SELECT will return an error:
SELECT XMLElement ('emp', XMLAgg (XMLElement ('name', name))) from emp;
ERROR: 0 : XML: Corrupted XML Block
The workaround is to use rtrim() on the CHAR column, for example:

SELECT XMLElement ('emp', XMLAgg (XMLElement ('name', rtrim (name)))) from emp;
44894
Only arrays of type varchar support replacing elements by name. For example, given an array of integers, attempting to replace the array element named one with the integer 22 returns an error:
SELECT replace_element(myarray,'one',22);
ERROR: 16 : Expected string argument
The workaround is to replace the element by index instead. For example:

SELECT replace_element(myarray,1,22);
44384
Arrays of type timetz are not supported.
CHAPTER 2
XML Data
User Type XML
Referencing Columns
Getting Started: Publishing SQL Data as XML
Using XPath Expressions
XML Function Reference
One of the most intriguing and urgent requirements to arise from the appearance of XML is a well-defined relationship between XML and SQL. Vast quantities of business data are currently stored in SQL database systems and great demand exists for the ability to present that data in XML form to various client applications. (Special Interest Group on Management of Data, ACM)

The XML functions provided by Netezza as extensions to the SQL language are modeled after the SQL/XML specification contained in SQL-2003. The SQL/XML specification defines ways of importing and storing XML data in a SQL database, manipulating it within the database, and publishing both XML and conventional SQL data in XML form.

Publishing conventional SQL data in XML form enables you to transform the flat (non-hierarchical) result sets of SQL queries into hierarchically structured XML data; one important use of this transformation is to make this data available via web services. The functions used to publish SQL data in XML format are XMLRoot, XMLElement, XMLConcat, XMLAgg, and XMLAttributes.

Data that is already stored in the database as XML can be queried, manipulated, and updated using functions such as XMLExistsNode, XMLExtract, XMLExtractValue, and XMLUpdate. Because XML data consists of a tree of nodes, these functions rely on W3C XPath expressions to locate individual XML nodes within the tree.

Note: Certain features of the SQL 2003 SQL/XML specification, including the ability to pass column names into functions and the ability to construct sets, are not supported by Netezza user-defined functions (UDFs).

For more information on industry standards for SQL extensions, refer to ISO/IEC 9075-14.
User Type XML

The XML functions in the Netezza SQL Extensions Toolkit rely on the XML data type as defined in the SQL 2003 SQL/XML specification. Because the Netezza database currently does not support user-defined types, the XML type is stored in a varchar field. The maximum size of a varchar field is 64000 bytes. The XML type is a compiled representation of an XML file, usable wherever a SQL data type is allowed. The semantics of operations on values of XML type assumes a tree-based internal representation. An XML value is either the null value, or a collection of nodes that consists of exactly one XML root node and every node that can be reached recursively by traversing the properties of the nodes.
Referencing Columns
The SQL/XML specification supports the ability to pass column names directly into functions. Netezza user-defined functions (UDFs) do not support this ability. Therefore, element names must be explicitly specified as additional parameters, as in the following example:
SELECT XMLElement('Employee', XMLAttributes('EID', a.id), a.name) from employees a;
Getting Started: Publishing SQL Data as XML

This section explains how to use the XMLElement, XMLConcat, XMLAgg, and XMLAttributes functions within a SQL expression to transform the results of a database query into XML. These are often referred to as publishing functions because the goal is to convert data stored in a relational database into XML that can be made available to other applications, for example web services. The main function in this regard is XMLElement, which takes two arguments, the name of the XML element to create and the content of that element. The following select statement (which does not actually query a database) highlights the use of XMLElement:
select XMLElement('Parent', 'Parent Text');
This creates the following XML:

<Parent>Parent Text</Parent>
It is very important to note that the output from the XMLElement function is a value of type XML, which is the Netezza compiled representation of the XML element. So if you typed the preceding select statement, the return would be the type name XML:
XMLELEMENT
----------
XML
(1 row)
In order to see the actual XML element created by the XMLElement call (<Parent>Parent Text</Parent>), you need to wrap the XMLElement call with XMLSerialize. For example:
select XMLSerialize(XMLElement('Parent', 'Parent Text'));
The real power of XMLElement is that the function calls can be nested to produce the hierarchical structure required for XML data. For example:
select XMLElement('Parent', XMLElement('Child', 'Child text'));
This query produces the following XML:

<Parent>
  <Child>Child text</Child>
</Parent>
The publishing functions can be nested as required, up to a limit of 10,000 nested calls. For example:
select XMLElement('Parent', XMLElement('Child', XMLElement('GrandChild', 'Grandchild text')));
This query produces the following XML:

<Parent>
  <Child>
    <GrandChild>Grandchild text</GrandChild>
  </Child>
</Parent>
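The same nesting idea can be tried outside the NPS system. The following Python sketch uses the standard xml.etree.ElementTree module as a stand-in for the publishing functions; it is an analogy only and not the toolkit's implementation. It builds the three-level hierarchy from the last query and then serializes it in a separate step, much as XMLSerialize is a separate call from XMLElement.

```python
import xml.etree.ElementTree as ET

# Build the hierarchy by nesting element construction,
# one level per XMLElement call in the query above.
parent = ET.Element("Parent")
child = ET.SubElement(parent, "Child")
grandchild = ET.SubElement(child, "GrandChild")
grandchild.text = "Grandchild text"

# The element object is not text; serializing it is a separate step.
serialized = ET.tostring(parent, encoding="unicode")
print(serialized)
```

The serialized string has no added white space between elements; the indentation in the listings above is for readability only.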
As a more realistic example, suppose there is a DEPARTMENTS table that contains three columns: DEPTNO, DEPTNAME, and DEPTLOC:
DEPTNO  DEPTNAME     DEPTLOC
------  -----------  --------
10      MARKETING    BOSTON
20      HR           BOSTON
30      SALES        NEW YORK
40      ENGINEERING  NEW YORK
A plain SQL query to list all departments would look like the following:
select * from departments;
But suppose you needed to return all four rows of department data as XML, with one <Dept> node for each department, and each <Dept> node containing three child nodes, <Number>, <Name>, and <Location>, as shown in the following XML document:
<Departments>
  <Dept>
    <Number>10</Number>
    <Name>MARKETING</Name>
    <Location>BOSTON</Location>
  </Dept>
  <Dept>
    <Number>20</Number>
    <Name>HR</Name>
    <Location>BOSTON</Location>
  </Dept>
  <Dept>
    <Number>30</Number>
    <Name>SALES</Name>
    <Location>NEW YORK</Location>
  </Dept>
  <Dept>
    <Number>40</Number>
    <Name>ENGINEERING</Name>
    <Location>NEW YORK</Location>
  </Dept>
</Departments>
To create this XML document, you would use a SELECT statement modeled after the following:
SELECT XMLElement('Departments',
  XMLAGG(
    XMLElement('Dept',
      XMLConcat(
        XMLElement('Number', d.deptno),
        XMLElement('Name', d.deptname),
        XMLElement('Location', d.deptloc)))))
from departments d;
In each of the first two XMLElement calls, the content of the element is created by a nested XML function call. To create a hierarchically structured XML document of parent and child nodes, you nest the XMLElement calls within a SQL statement. So the first XMLElement function in the query creates the top-level <Departments> node:
XMLElement('Departments', XMLAgg (
The XMLAgg call is used for the second argument, indicating that the content for the top-level <Departments> node is a group of aggregated nodes, which means these nodes will be child nodes of a single parent node. The second XMLElement call establishes <Dept> as the name of each child node of the <Departments> parent node, and then relies on the next three embedded XMLElement calls for the contents of each <Dept> child node:
XMLElement('Dept', XMLConcat( XMLElement('Number', d.deptno), XMLElement('Name', d.deptname), XMLElement('Location', d.deptloc)))))
This call is evaluated once for each row returned from the DEPARTMENTS table, so it creates as many <Dept> child nodes as necessary to wrap the rows of data. It is very important to understand the use of the XMLAGG function. This function aggregates child nodes under their parent node, which in the preceding example means that there is a single parent <Departments> node that contains all four <Dept> nodes. Without the XMLAGG call, the XML produced would contain four <Departments> nodes, each of which contains a single <Dept> node, which would result in an invalid XML document, as shown here:
<Departments>
  <Dept>
    <Number>10</Number>
    <Name>MARKETING</Name>
    <Location>BOSTON</Location>
  </Dept>
</Departments>
<Departments>
  <Dept>
    <Number>20</Number>
    <Name>HR</Name>
    <Location>BOSTON</Location>
  </Dept>
</Departments>
<Departments>
  <Dept>
    <Number>30</Number>
    <Name>SALES</Name>
    <Location>NEW YORK</Location>
  </Dept>
</Departments>
<Departments>
  <Dept>
    <Number>40</Number>
    <Name>ENGINEERING</Name>
    <Location>NEW YORK</Location>
  </Dept>
</Departments>
This is not well-formed XML because there are four instances of the <Departments> document element, and an XML document can have only one. This demonstrates how important it is to use the IsValidXML function to ensure that the XML you create with the function library can be parsed as XML. Furthermore, if you are using schemas, then you are also responsible for returning XML that conforms to the structure specified by the schema.

As another example, suppose you want to return a list of employees by department, tagged as follows:
<EmployeesByDepartment>
  <Dept DeptNo="10">
    <Name>ACCOUNTING</Name>
    <Location>NEW YORK</Location>
    <Employees>
      <Employee EmpNo="7782">
        <Name>CLARK</Name>
        <Job>MANAGER</Job>
        <Manager>7839</Manager>
        <Salary>2450</Salary>
      </Employee>
      <Employee EmpNo="7839">
        <Name>KING</Name>
        <Job>PRESIDENT</Job>
        <Salary>5000</Salary>
      </Employee>
      ...
    </Employees>
  </Dept>
  ...
</EmployeesByDepartment>
To return employees by department, two select statements are required: first create an employee grouping and then group the employees by department:
-- Step 1: build one <Employees> block per department.
CREATE temp table emp_grouping AS
SELECT emp.deptno,
  XMLElement('Employees',
    XMLAGG(
      XMLElement('Employee', XMLAttributes('EmpNo', empno),
        XMLConcat(
          XMLElement('Name', name),
          XMLElement('Job', job),
          XMLElement('Manager', mgr),
          XMLElement('Salary', sal),
          XMLElement('Comm', comm))))) AS xml
FROM emp INNER JOIN dept ON emp.deptno = dept.deptno
GROUP BY emp.deptno;

-- Step 2: wrap each block in its <Dept> element and aggregate the departments.
SELECT XMLElement('EmployeesByDepartment',
  XMLAGG(
    XMLElement('Dept', XMLAttributes('DeptNo', d.deptno),
      XMLConcat(
        XMLElement('Name', d.dname),
        XMLElement('Location', d.loc),
        emp_grouping.xml))))
FROM dept d INNER JOIN emp_grouping ON d.deptno = emp_grouping.deptno;
Using XPath Expressions

XML documents are organized as a tree, consisting of a root node and descendant child nodes. The function library relies on XPath arguments to navigate within this tree and locate individual XML nodes. The result of an XPath expression can be either a node or a set of element, text, or attribute nodes. For example, the XPath expression /ABC/DEF selects all DEF child nodes under the ABC root node of the XML document. The following table gives an overview of the most common features of XPath syntax.

XPath Syntax    Usage

/    The initial forward slash in an XPath expression specifies the root of the tree. Specify an absolute path with an initial slash. For example, /ABC specifies the root node's child element named ABC. If the initial slash is omitted, the path is relative and the context of the relative path defaults to the root node. Subsequent forward slashes within an XPath expression are used as path separators to identify the child nodes of any given node. For example, /ABC/DEF specifies the DEF element, which is a child of the ABC element, which is a child of the root element.

//    Two forward slashes specify all descendants of the current node. For example, ABC//DEF matches any DEF element under the ABC element.

*    The asterisk is the wildcard character and specifies a match on any child node. For example, /ABC/*/DEF matches any DEF element that is a grandchild of the ABC element.

[ ]    Specifies predicate expressions, which can combine conditions with the operators or and and, and with the not() function. For example, /RESIDENTS[AGE=65 and NAME="Jane Doe"]/ADDRESS selects the address element of all residents whose age is 65 and whose name is Jane Doe. [ ] is also used to denote an index into a list. For example, /POSTOFFICE/BOX[10] identifies the tenth box number element under the POSTOFFICE root element.

nodename    Selects all child nodes of the named node. For example:
    bookstore selects all the child nodes of the bookstore element.
    /bookstore selects the root element bookstore. If the path starts with a slash ( / ) it always represents an absolute path to an element.
    bookstore/book selects all child book elements of bookstore.
    //book selects all book elements in the document.
    bookstore//book selects all book elements that are descendants of bookstore, no matter where they are under the bookstore element.

.    Selects the current node.

..    Selects the parent of the current node.

@    Selects attributes. For example, //@lang selects all attributes that are named lang.

functionname    XPath supports a set of built-in functions such as substring(), round(), and not(). In addition, user-defined functions can be made available using namespaces.
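Several of these forms can be tried outside the NPS system. The following Python sketch uses the standard xml.etree.ElementTree module, which supports only a subset of XPath: paths are written relative to the root element (./ instead of a leading /), and conditions are chained as separate predicates because the and operator is not available. The sample document and its element names are hypothetical, and the sketch says nothing about how the toolkit itself evaluates XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; POSTOFFICE is the root element.
doc = ET.fromstring(
    "<POSTOFFICE>"
    "<BOX>101</BOX><BOX>102</BOX><BOX>103</BOX>"
    "<RESIDENT><AGE>65</AGE><NAME>Jane Doe</NAME><ADDRESS>1 Main St</ADDRESS></RESIDENT>"
    "<RESIDENT><AGE>40</AGE><NAME>John Roe</NAME><ADDRESS>2 Elm St</ADDRESS></RESIDENT>"
    "</POSTOFFICE>"
)

# Index predicate: positions are 1-based, so [2] is the second BOX.
second_box = doc.find("./BOX[2]").text

# Descendant search: every ADDRESS anywhere below the root.
all_addresses = [e.text for e in doc.findall(".//ADDRESS")]

# Value predicates, chained because ElementTree has no 'and' operator.
jane = [e.text for e in doc.findall("./RESIDENT[AGE='65'][NAME='Jane Doe']/ADDRESS")]

# Wildcard: NAME elements that are grandchildren of the root.
names = [e.text for e in doc.findall("./*/NAME")]

print(second_box, all_addresses, jane, names)
```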
XML Function Reference

This section lists the available XML functions alphabetically.
IsValidXML
Determines whether or not a character string can be parsed as XML.
Description
The IsValidXML function has the following syntax:
boolean = IsValidXML(varchar input);
The input value specifies the character string to analyze.
Returns
The function returns true if the character string input can be parsed as XML; otherwise, the function returns false. For example:
select IsValidXML('<tag1>12</tag1>');
select IsValidXML('<tag1><tag2>');
The first example returns true; the second example returns false.
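The idea of "can be parsed as XML" can be checked the same way in other environments. The following Python sketch is an analogy using the standard library parser, not the toolkit's parser, so edge cases may differ; the function name can_parse_as_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def can_parse_as_xml(text):
    """Return True if the standard library parser accepts the string."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


first = can_parse_as_xml("<tag1>12</tag1>")   # complete element
second = can_parse_as_xml("<tag1><tag2>")     # tags are never closed
print(first, second)
```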
IsXML
Determines whether the input argument is a compiled Netezza XML document; in other words, whether the input argument is of type XML.
Description
The IsXML function has the following syntax:
bool = IsXML(XML input);
The input value specifies the XML object to analyze.
Returns
The function returns true if the input varchar is a compiled Netezza XML document. Otherwise it returns false. It is important to explicitly check whether the XML you produce by embedding SQLX functions within your SQL is valid XML, since the underlying SQLX engine does not perform any error checking or validation. Note that if you are using schemas, then you are also responsible for returning XML that conforms to the structure specified by the schema. For example:
select IsXML(XMLParse('<tag1>12345</tag1>'));
This example returns true.
XMLAGG
This publishing function aggregates the set of XML inputs into a single XML object.
Description
The XMLAGG function has the following syntax:
XML = XMLAGG(Set(XML) inputs);
The inputs value specifies the set of XML inputs to aggregate into a single XML object.
Returns
The function returns a compiled representation (type XML) of a single XML object which has been aggregated from a set of XML inputs. For example:
SELECT XMLElement('Departments',
  XMLAGG(
    XMLElement('Dept',
      XMLConcat(
        XMLElement('Number', d.deptno),
        XMLElement('Name', d.deptname),
        XMLElement('Location', d.deptloc)))))
from departments d;
Assuming that the query returns three rows of data, a possible return value might look like this:
<Departments>
  <Dept>
    <Number>10</Number>
    <Name>MARKETING</Name>
    <Location>BOSTON</Location>
  </Dept>
  <Dept>
    <Number>20</Number>
    <Name>HR</Name>
    <Location>BOSTON</Location>
  </Dept>
  <Dept>
    <Number>30</Number>
    <Name>SALES</Name>
    <Location>NEW YORK</Location>
  </Dept>
</Departments>
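The aggregation step can be pictured as collecting one element per row under a single parent. The following Python sketch imitates that with the standard xml.etree.ElementTree module; it is an analogy rather than the toolkit's implementation, the rows are hard-coded instead of queried, and the helper name dept_element is hypothetical.

```python
import xml.etree.ElementTree as ET

# Three rows standing in for the result of the query on the departments table.
rows = [
    (10, "MARKETING", "BOSTON"),
    (20, "HR", "BOSTON"),
    (30, "SALES", "NEW YORK"),
]


def dept_element(deptno, deptname, deptloc):
    """Build one <Dept> element, the per-row part of the query."""
    dept = ET.Element("Dept")
    ET.SubElement(dept, "Number").text = str(deptno)
    ET.SubElement(dept, "Name").text = deptname
    ET.SubElement(dept, "Location").text = deptloc
    return dept


# Aggregation: every per-row element becomes a child of one parent.
departments = ET.Element("Departments")
departments.extend([dept_element(*row) for row in rows])

document = ET.tostring(departments, encoding="unicode")
print(document)
```

Creating the parent once, outside the per-row work, is what keeps the result to a single document element.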
XMLAttributes
This publishing function constructs an XML Attribute object. This object is not a valid XML object; rather, it must be assigned as an attribute value of an XMLElement.
Description
The XMLAttributes function has the following syntax:
XML_Attrib = XMLAttributes(varchar name, varchar value);
The name value specifies the name of the XML attribute to construct. The value value specifies the value of the XML attribute to construct.
Returns
The function returns an XML Attribute object. The following example produces an Emp element for each employee, with an ID and name attribute:
SELECT XMLELEMENT ( 'Emp', XMLATTRIBUTES (e.id,e.fname ||' ' || e.lname AS "name")) AS "result" FROM employees e WHERE employee_id > 200;
This query produces an XML result fragment. For example:

<Emp ID="1001" name="John Smith"/> <Emp ID="1206" name="Jane Doe"/>
XMLConcat
This publishing function concatenates two XML objects (either two elements or two attributes) to produce a single XML object.
Description
The XMLConcat function has two forms, one for concatenating elements and another for concatenating attributes:
XML = XMLConcat(XML inputa, XML inputb);
XML_Attrib = XMLConcat(XML_Attrib inputa, XML_Attrib inputb);
The inputa value specifies the first XML object to concatenate. The inputb value specifies the second XML object to concatenate.
Returns
The function returns a compiled representation (type XML) of the concatenated XML input objects as a single XML object. If either of the input XML objects is null, the function returns null. For an example of the use of XMLConcat, see the example for XMLAgg.
XMLElement
This publishing function constructs an XML Element. The XMLElement function is typically nested to produce a hierarchically structured XML document.
Description
The XMLElement function has the following syntax:
XML = XMLElement(varchar name, [XML_Attrib attrib,] varchar value);
The name value specifies the name of the enclosing tag for the XML element. If the identifier specified is NULL, then no element is returned. Note that the name cannot be a column name or column reference, a difference from the SQL/XML specification. One or more optional attrib values specify one or more name-value pairs that create attributes for the XML element. The value value specifies the content of the newly constructed XML element. This can be either a scalar value or a nested XMLElement call.
Returns
The function returns a compiled representation (type XML) of an XML element with the specified name, content, and optionally a collection of attributes. It does not create prolog information. For example:
select XMLElement('Parent', XMLElement('Child', 'Child text'));
This example returns:

<Parent><Child>Child text</Child></Parent>
XMLExistsNode
Determines whether using an XPath to traverse the XML input document results in at least a single XML element or text node.
Description
The XMLExistsNode function has the following syntax:
bool = XMLExistsNode(XML input, varchar XPath);
The input value specifies a compiled representation of an XML file. Values can be any built-in SQL type. The XPath value specifies the XPath of the XML node to look for.
Returns
Returns true if the XPath leads to an XML element or text node in the XML input object. Otherwise returns false. For example:
SELECT person FROM MAILINGLIST WHERE XMLExistsNode(person,'/MailingList[Occupation="Doctor"]') = 1;
This example returns rows from MAILINGLIST only if nodes exist that satisfy the condition. Note: When using the XMLExistsNode() function in a query, it must always be specified in the WHERE clause, not in the SELECT list.
XMLExtract
Finds the XML node(s) specified by the XPath expression. The extracted nodes can be elements, attributes, or text nodes. XMLExtract can be used to extract:
- Numerical values on which function-based indexes can be created to speed up processing.
- Collection expressions for use in the FROM clause of SQL statements.
- XML fragments to be combined into a single XML document.
Description
The XMLExtract function has the following syntax:
XML = XMLExtract(XML input, varchar XPath);
The input value specifies the XML file from which to extract the node. The XPath value specifies an XPath query which specifies an XML node within the XML file.
Returns
If more than one item is found by this function, only the first will be returned. If no item is found, null is returned. The following example uses XMLExtract to query the value of the Reference column for orders with SpecialInstructions set to Rush:
SELECT XMLExtract(object_value,'/PurchaseOrder/Reference') "REFERENCE"
FROM PURCHASEORDER
WHERE XMLExistsNode(object_value,'/PurchaseOrder[SpecialInstructions="Rush"]') = 1;
An example of a possible return value is as follows:

<Reference>JSMITH-20021009123336271PDT</Reference>
<Reference>ABELL-20021009123336321PDT</Reference>
<Reference>JDOE-20021009123337303PDT</Reference>
<Reference>GWASHINGTON-20021009123337123PDT</Reference>
XMLExtractValue
Extracts the actual (scalar) value from the XML input object specified by the XPath parameter. The result of the XPath query must be a single node and either an element, a text node, or an attribute. If a specific datatype is desired, XMLExtractValue can be wrapped with a conversion function, for example a function that converts the varchar to a date.
Description
The XMLExtractValue function has the following syntax:
varchar = XMLExtractValue(XML input, varchar XPath);
The input value specifies an XML file. The XPath value specifies the XPath query.
Returns
If the result is an element then it must have a single text node as its child; the child node provides the text content for the scalar return value. If the node does not exist, this function returns null. If more than one node is returned by the XPath expression or if the expression points to an element node with anything other than a single text child node, this function returns an error. For example, the following query extracts the scalar value of the Reference column:
SELECT XMLExtractValue(object_value,'/PurchaseOrder/Reference') "REFERENCE"
FROM PURCHASEORDER
WHERE XMLExistsNode(object_value,'/PurchaseOrder[SpecialInstructions="Rush"]') = 1;
An example of a possible return value is shown below. Note the difference from the return value for the similar example for XMLExtract. In that example, each line of data is wrapped with a <Reference> element. Here, just the scalar value is extracted and returned:
JSMITH-20021009123336271PDT
ABELL-20021009123336321PDT
JDOE-20021009123337303PDT
GWASHINGTON-20021009123337123PDT
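The difference between extracting a node and extracting its scalar value can be reproduced outside the NPS system. The following Python sketch uses the standard xml.etree.ElementTree module as an analogy; it is not the toolkit's implementation, and the sample document is a hypothetical single purchase order. Paths are written relative to the PurchaseOrder root element because ElementTree does not accept a leading slash.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order document.
order = ET.fromstring(
    "<PurchaseOrder>"
    "<Reference>JSMITH-20021009123336271PDT</Reference>"
    "<SpecialInstructions>Rush</SpecialInstructions>"
    "</PurchaseOrder>"
)

# Node extraction: the result is still an element, tags included.
node = order.find("./Reference")
as_fragment = ET.tostring(node, encoding="unicode")

# Value extraction: only the text content of the single matching node.
as_value = order.findtext("./Reference")

# A path that matches nothing yields None, comparable to a null result.
missing = order.findtext("./NoSuchNode")

print(as_fragment)
print(as_value)
print(missing)
```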
XMLParse
Converts a value of type varchar to a value of type XML, which is the Netezza compiled representation of an XML object (stripping white space by default). The inverse function is XMLSerialize. Note: XMLParse is not intended for parsing and loading external data into XML columns. Though it is possible to call XMLParse as a part of an external table load, the resulting XML datatype is stored as a VARCHAR which has a maximum size of 64000 bytes.
Description
The XMLParse function has the following syntax:
XML = XMLParse(varchar input);
The input value specifies a varchar representation of an XML input object.
Returns
The function returns the Netezza compiled representation of an XML object. If the input varchar resolves to null, the function returns null. For example:
select XMLParse('<Parent>Parent Text</Parent>');
This example returns a value of type XML which is the compiled representation of the XML object <Parent>Parent Text</Parent>.
XMLRoot
This publishing function creates a new XML value by providing the version and standalone properties in the XML root information (prolog) of the specified value of type XML. This creates the root node if it does not already exist. Typically, this is done to ensure data-model compliance.
Description
The XMLRoot function has the following syntax:
XML = XMLRoot(XML input, float version, bool standalone);
The input value specifies the XML object to update. The version value specifies the version property of the input XML object. The standalone value specifies the standalone property of the input XML object.
Returns
The function returns the updated object. If a prolog already exists, an error is returned. For example:
INSERT INTO employees (id, xvalue) VALUES (1001, XMLROOT (XMLPARSE ('<Emp> John Smith </Emp>'), '1.0', true));
XMLSerialize
Converts a value of type XML to a value of type varchar. The inverse function is XMLParse.
Description
The XMLSerialize function has the following syntax:
varchar = XMLSerialize(XML input);
The input value specifies a value of type XML, which is the Netezza compiled representation of an XML file. Values can be any built-in SQL type.
Returns
The function returns the varchar representation of the input XML object. For example:
select XMLSerialize(XMLElement('Parent', 'Parent Text'));

<Parent>Parent Text</Parent>
Without the XMLSerialize call, the XMLElement call returns the type name XML:
XMLELEMENT
----------
XML
(1 row)
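The inverse relationship between parsing and serializing can be illustrated outside the NPS system. In the following Python sketch, the standard xml.etree.ElementTree module stands in for XMLParse and XMLSerialize; it is an analogy only and does not reproduce the toolkit's compiled XML format.

```python
import xml.etree.ElementTree as ET

text = "<Parent>Parent Text</Parent>"

# Parsing turns the character string into a structured object.
parsed = ET.fromstring(text)

# Printing the object shows a description of it, not the markup,
# comparable to seeing the type name XML in a query result.
print(parsed)

# Serializing converts the object back into a character string.
round_trip = ET.tostring(parsed, encoding="unicode")
print(round_trip)
```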
XMLUpdate
Updates the portion of an XML document (elements, attributes, or nodes) identified by XPath with a new value. The datatypes of the XPath target and the new value must match. XMLUpdate cannot be directly used to insert a new node or delete an existing node, element, or attribute. Instead, you need to update the containing parent element with the new value.
Description
The XMLUpdate function has two forms, one to update the XML document with a scalar (varchar) value and another to update the XML document with an XML document:
XML = XMLUpdate(XML input, varchar XPath, varchar value);
XML = XMLUpdate(XML input, varchar XPath, XML value);
The input value specifies an XML document that contains the fragment to be updated. The XPath value specifies the XPath expression used to locate the XML fragment to update. If XPath is an XML element, then the corresponding value must be type XML. If XPath is an attribute or text node, then the value can be any scalar datatype. The value value specifies the new value to assign to the XML fragment.
Returns
The function returns an XML document that contains an updated fragment. For example:
update sales_tab set order = XMLUpdate(order, '/order/company/name', XMLParse('<name>Netezza</name>')) where sales_person = 'John Smith';
This example updates the company name in order XML documents to Netezza, where the salesperson is John Smith.
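The locate-then-replace behavior can be sketched outside the NPS system. The following Python example uses the standard xml.etree.ElementTree module as an analogy; it is not the toolkit's implementation, and the order document is hypothetical. One difference to keep in mind: XMLUpdate returns an updated document, whereas this sketch copies the document first and then changes the copy.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical order document.
original = ET.fromstring(
    "<order><company><name>Acme</name></company><total>100</total></order>"
)

# Work on a copy so the original document is left unchanged.
updated = copy.deepcopy(original)

# Locate the fragment with a path relative to the <order> root, then replace its text.
updated.find("./company/name").text = "Netezza"

before = ET.tostring(original, encoding="unicode")
after = ET.tostring(updated, encoding="unicode")
print(before)
print(after)
```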
CHAPTER 3
Data Transformation
Data Transformation Function Reference
The functions in this chapter transform data into a different representation, for the purposes of security, space savings, or transmission time savings. The functions in many cases rely on industry-standard algorithms, as noted in the function descriptions. For more information on these algorithms, refer to the publicly available documentation. Note: Compressed and encrypted data exists in a binary format that is not readable. To display this data, it must first be decompressed/decrypted to avoid output alignment problems. If table columns contain compressed or encrypted data, selects on that table need to use the decompress/decrypt functions to process the binary data in those columns properly.

Because compress/decompress, encrypt/decrypt, and uuencode/uudecode are inverse functions, they are listed together, rather than strictly alphabetically, for ease of comparison.
compress
Compresses a varchar using the public source zlib software library. The zlib library uses the DEFLATE compression algorithm, which combines a variation of LZ77 (Lempel-Ziv 1977) with Huffman coding. Compression is the process of encoding data so that it uses fewer bits. For example, a simple compression scheme replaces instances of contiguous, repeated characters with a single character and a count; DEFLATE generalizes this idea by replacing repeated sequences with references to an earlier occurrence. Compressed data must be decompressed before it can be used.
Description
The compress function has the following syntax:
varchar = compress(varchar input[, int level]);
The input value specifies the varchar to be compressed. The level value specifies the compression level used. It can be between 0 and 9 with 0 indicating the least compression and 9 indicating the most compression. The default is 6. Increasing the compression level increases the processing time.
Returns
The function returns the compressed varchar. For example:
select decompress (compress('1234567890'));

1234567890
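Because the function is built on zlib, the effect of the level argument can be explored with any zlib binding. The following Python sketch uses the standard zlib module directly; it shows zlib's own behavior and is not a statement about the exact bytes the toolkit's compress function stores. The repeated sample string is an assumption chosen so that there is something to compress.

```python
import zlib

# Repetitive sample data: 3000 bytes.
data = b"1234567890" * 300

default_level = zlib.compress(data)    # zlib's default trade-off
fastest = zlib.compress(data, 1)       # low level: less work, usually larger output
smallest = zlib.compress(data, 9)      # high level: more work, usually smaller output
stored = zlib.compress(data, 0)        # level 0: data is stored without compression

# Decompression restores the original bytes regardless of the level used.
restored = zlib.decompress(smallest)

print(len(data), len(stored), len(fastest), len(default_level), len(smallest))
print(restored == data)
```

At level 0 the output is slightly larger than the input because zlib still adds its framing bytes.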
decompress
Decompresses a previously compressed varchar.
Description
The decompress function has the following syntax:
varchar = decompress(varchar input);
The input value specifies the compressed varchar to be decompressed.
Returns
The function returns the decompressed varchar. For example:
select decompress (compress('1234567890'));

1234567890
encrypt/decrypt
Encrypts or decrypts the input varchar using the supplied key. Encryption is the process of transforming data in order to maintain its secrecy; the data can be read (unencrypted) only if the recipient has the required key.

The Netezza implementation uses symmetric encryption, also known as private or secret key encryption, because the same secret key is used to encrypt and to decrypt data. This means that this secret key must be made available on any server that is decrypting previously encrypted data. You can choose which symmetric encryption algorithm the function uses to encrypt/decrypt the data, either AES (Advanced Encryption Standard) or RC4.

Private key encryption is more secure than public key encryption because all public key encryption schemes are susceptible to brute force key search attacks. But private key encryption depends on maintaining the secrecy of the key, so you should periodically change the private key and take steps to ensure that it cannot be discovered in use, in storage, or in distribution (see the description of the key argument below for Netezza specific security recommendations).

Note: This is field level encryption, not database encryption.
Description
The encrypt function has the following syntax:
varchar = encrypt(varchar text, varchar key [, int algorithm]);
The decrypt function has the following syntax:

varchar = decrypt(varchar text, varchar key [, int algorithm]);
The text value specifies the value to be encrypted/decrypted.
The key value specifies the key to use to encrypt/decrypt the value. Care must be taken to secure the key or else the security will be compromised. Keep in mind the architecture of the Netezza system when designing your security system, including the following:
SQL functions are logged in the pg.log file on the Netezza host, so executing encrypt(secret_column, my_secret_key) will reveal your key to anyone who can read the pg.log file.
ODBC/JDBC conversations are easily captured with any number of diagnostic/hacking tools. If your key is transmitted as part of the SQL, it can be compromised during this process.
For these reasons it is recommended that the secret key be stored in a table and passed into the encrypt/decrypt functions through a table join. For example:
SELECT decrypt(a.value, b.key) FROM my_table a, my_keys b WHERE b.key_id = 1;
The algorithm value can be either RC4 or one of the versions of AES, as shown in the following list. RC4, although the most widely-used encryption algorithm (used for example by SSL and WEP), is not cryptographically secure and is vulnerable to attacks. The Advanced Encryption Standard (AES) is the encryption standard adopted by the United States government and is required for all classified information. The three versions of AES differ only in the design and strength of the key lengths. While all three key lengths are sufficient to protect classified information up to the SECRET level, TOP SECRET information requires the use of key lengths 192 or 256.
0 RC4 (default if no algorithm given)
1 AES 128
2 AES 192
3 AES 256
Returns
The function returns an encrypted/decrypted varchar. For example:
Select decrypt (encrypt('123456',100,0),100,0);

123456
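The symmetric property described above, where one key both encrypts and decrypts, can be illustrated with a short pure-Python RC4 routine. This is only a sketch of the textbook RC4 algorithm, not the toolkit's implementation; the rc4 function name and the key and message values are hypothetical. RC4 is used here because it fits in a few lines of standard Python, whereas AES would need a third-party library.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Apply the RC4 keystream to data; the same call encrypts and decrypts."""
    # Key-scheduling: permute the state array under control of the key.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]

    # Keystream generation: XOR each data byte with the next keystream byte.
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        out.append(byte ^ state[(state[i] + state[j]) % 256])
    return bytes(out)


secret_key = b"my_secret_key"           # hypothetical key
ciphertext = rc4(secret_key, b"123456")
plaintext = rc4(secret_key, ciphertext)  # the same key reverses the operation
print(plaintext)
```

Because anyone holding the key can recover the plaintext, the recommendations above about keeping the key out of logged SQL apply regardless of the algorithm chosen.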
uuencode
Encodes a binary value as ASCII using the Unix UUencode format. The encoding translates the binary value into ASCII character codes in the range 32 and above. Uuencoding has historically been used to encode files destined for e-mail transmission. The uudecode function reverses the effect of uuencode, recreating the original binary file exactly.
The uuencode algorithm does the following:
1. Divides the binary value into groups of three bytes (24 bits), adding zeroes to the end of the binary value if necessary to create a final group of three bytes.
2. Splits the 24 bits into four groups of six bits each. This creates four decimal numbers which lie in the range 0 to 63.
3. Adds decimal 32 to each number to create ASCII characters in the range 32 (space) to 95 (underscore).
Step 1 is illustrated by the following table.
Table 3-1: Uuencoding, Part I
ASCII Input  ASCII Decimal  ASCII Binary (8 bit)
h            104            01101000
a            97             01100001
t            116            01110100
Steps 2 and 3 are illustrated by the following table. Note the transformation of the three 8 bit ASCII Binary values in the preceding table to the four 6 bit Binary values in the first column of the table:
Table 3-2: Uuencoding, Part II
6 Bit Binary  Decimal Equivalent  Decimal + 32  Uuencoding
011010        26                  58            :
000110        6                   38            &
000101        5                   37            %
110100        52                  84            T
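The three steps above can be sketched in a few lines of Python. This is a minimal illustration of the character mapping only, under the assumption that the output carries no length prefix or begin/end lines (the full Unix file format adds those); the uuencode_body name is hypothetical and is not a toolkit function.

```python
def uuencode_body(data: bytes) -> str:
    """Encode bytes using the three steps described above (no length prefix)."""
    # Step 1: pad with zero bytes so the length is a multiple of three.
    padded = data + b"\x00" * (-len(data) % 3)
    out = []
    for i in range(0, len(padded), 3):
        # Combine three bytes into one 24-bit integer.
        group = int.from_bytes(padded[i:i + 3], "big")
        # Step 2: split into four 6-bit values, most significant first.
        for shift in (18, 12, 6, 0):
            six_bits = (group >> shift) & 0x3F
            # Step 3: add 32 to land in the printable ASCII range 32 to 95.
            out.append(chr(six_bits + 32))
    return "".join(out)


print(uuencode_body(b"hat"))  # prints :&%T
```

Every three input bytes become four output characters, so the encoded value is about one third longer than the input.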
Description
The uuencode function has the following syntax:
varchar = uuencode(varchar input);
The input value specifies the binary varchar to be uuencoded.
Returns
The function returns a UUencoded string. For example:
select uuencode ('hat');
The uuencoding for hat is:

:&%T
uudecode
Decodes an ASCII value that was previously encoded using the Unix UUencode format.
Description
The uudecode function has the following syntax:
varchar = uudecode(varchar input);
The input value specifies the string to be uudecoded.
Returns
The function returns a UUdecoded string. For example:
select uudecode (':&%T');

hat
CHAPTER 4
Hashing
Hash Function Reference
Hashing functions are used to encode data, transforming the input into a hash code or hash value. The hash algorithm is designed to minimize the chance that two inputs will have the same hash value, termed a collision.
Hashing functions are used to speed up the retrieval of data records (simple one-way lookups), for the validation of data (checksums), and for cryptography. For lookups, the hash code is used as an index into a hash table which contains a pointer to the data record. For checksums, the hash code is computed for the data before storage/transmission and then recomputed afterward to verify data integrity; if the hash codes do not match, the data is corrupt. Cryptographic hash functions are used for data security.
Some common use cases for hashing functions include:
Detect duplicated records. Because the hash keys of duplicates will hash to the same bucket in the hash table, the task reduces to scanning buckets that have more than one record, a much faster method than sorting and comparing each record in the file. (This same technique can be used to find similar records: because similar keys will hash to buckets that are contiguous, the search for similar records can be limited to those buckets.)
Locate points that are near each other. Applying a hashing function to spatial data effectively partitions the space being modeled into a grid, and as in the previous example, the retrieval/comparison time is greatly reduced because only contiguous cells in the grid need to be searched. This same technique works for other types of spatial data, such as shapes and images.
Verify message integrity. The message digest is computed both before and after transmission, and the two hash values are compared to determine whether the message was corrupted.
Verify passwords. During authentication, the user's login credentials are hashed and this value is compared with the hashed password stored for that user.

The SQL Extensions Toolkit has three hashing functions: hash(), hash4(), and hash8(). The hash() function is a cryptographic function that virtually never produces the same output for two different inputs. However, if speed in hash generation and comparison is required, or if all you need is a simple one-way lookup function, use hash4() or hash8() instead.
hash
Returns a 128 bit, 160 bit, or 256 bit hash of the input data, depending on the algorithm selected. This function provides between 2^128 and 2^256 distinct return values and is intended for cryptographic purposes. hash() is generally much slower to calculate than hash4() or hash8(). The return type is a 16 to 32 byte binary varchar. This can make hash comparisons slower than a simple integer comparison. On the Netezza platform, a column of these hashes cannot make use of zone-maps and other performance enhancements.
Description
The hash function has the following syntax:
varchar = hash(varchar data [, int algorithm]);
The data value specifies the varchar to hash. The algorithm value is specified by an integer code (defaults to 0). The available algorithms and the size of the resulting hash value are shown in the following table:
Table 4-1: Algorithms Supported for Cryptographic Hashing
Code  Description  Result
0     MD5          128 bit
1     SHA-1        160 bit
2     SHA-2        256 bit
Both the MD5 and SHA algorithms are message digest algorithms derived from MD4. The SHA (Secure Hash Algorithm) hash functions are the result of an effort by the National Security Agency (NSA) to provide strong cryptographic hashing capabilities. Security flaws have been identified in both SHA-1 and MD5. SHA-2 is still considered secure as of the publication date of this manual, but SHA-3 development is currently underway to prepare for any future security flaw discovered in SHA-2.
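The digest sizes listed for the three algorithm codes can be checked with Python's standard hashlib module. This is a sketch only: it assumes that the 256 bit SHA-2 variant corresponds to SHA-256, and the ALGORITHMS mapping and digest_bits helper are hypothetical names that mirror the integer codes used by hash(). It does not claim to reproduce the exact bytes the toolkit returns.

```python
import hashlib

# Integer codes as used by hash(); SHA-256 is assumed for the 256 bit SHA-2.
ALGORITHMS = {0: hashlib.md5, 1: hashlib.sha1, 2: hashlib.sha256}


def digest_bits(code: int, data: bytes = b"Netezza") -> int:
    """Return the digest length in bits for the given algorithm code."""
    return len(ALGORITHMS[code](data).digest()) * 8


for code in sorted(ALGORITHMS):
    print(code, digest_bits(code), "bit")
```

The 16, 20, and 32 byte digests are why a column of these values is compared more slowly than a 4 or 8 byte integer hash.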
Returns
The function returns the hashed input data. For example:
select hash('Netezza',0);
This example returns the 128 bit MD5 hash of the input as a 16 byte binary varchar.
hash4
Returns the 32 bit checksum hash of the input data. This function provides 2^32 (approximately 4 billion) distinct return values and is intended for data retrieval (lookups).
Description
The hash4 function has the following syntax:
int4 = hash4(varchar data [, int algorithm]);
The data value specifies the varchar to hash. The algorithm can be one of the following (defaults to Adler):
0 Adler
1 CRC32
Adler is the fastest checksum hash that is provided. However, it has poor coverage when the messages are less than a few hundred bytes (poor coverage means that two different inputs are more likely to hash to the same value, referred to as a collision). In this case, use the CRC32 algorithm, or switch to hash8 instead.
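Returns
The function returns the 32 bit hash of the input data as an int4. For example:
select hash4('Netezza',0);
This example returns 186778338.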
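The two checksum algorithms named above are also available in Python's standard zlib module, which makes it easy to experiment with them outside the database. The hash4_sketch function below is a hypothetical stand-in, not the toolkit function; it assumes the standard Adler-32 and CRC-32 definitions, and the Adler-32 value it computes for Netezza agrees with the value 186778338 quoted in this guide.

```python
import zlib


def hash4_sketch(data: bytes, algorithm: int = 0) -> int:
    """Return a 32-bit checksum: 0 selects Adler-32, 1 selects CRC-32."""
    if algorithm == 0:
        return zlib.adler32(data)
    if algorithm == 1:
        return zlib.crc32(data)
    raise ValueError("algorithm must be 0 (Adler) or 1 (CRC32)")


print(hash4_sketch(b"Netezza", 0))  # prints 186778338
print(hash4_sketch(b"Netezza", 1))
```

Python returns both checksums as unsigned values; how the toolkit maps results above 2^31 onto a signed int4 is not covered by this sketch.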
hash8
Returns the 64 bit hash of the input data. The function provides 2^64 distinct return values and is intended for data retrieval (lookups).
Description
The hash8 function has the following syntax:
int8 = hash8(varchar data [, int algorithm]);
The data value specifies the varchar to hash. Only one algorithm value is supported for this hashing function, 0, which indicates the Jenkins algorithm.
Returns
The function returns the 64 bit hash of the input data as an int8. For example:
select hash8('Netezza');
CHAPTER 5
Date and Time Comparisons
Date and Time Function Reference
There are three types associated with the date and time functions: date, time, and timestamp. The timestamp type is implicitly converted to date and time and can therefore be passed into any of the date/time functions. The date type is implicitly converted to type timestamp (but not time) and can therefore be supplied to any function that takes either a date or a timestamp. Values of type time cannot be converted into anything and therefore can only be supplied to functions that take this type. For example, although the signature for the next_month function indicates that the function takes an input value of type date, it is permissible to pass an input value of type timestamp into the next_month function.

The functions are organized alphabetically.
day
Determine the weekday in the specified date. Note: This can also be accomplished using the Netezza date_part() function.
Description
The day function has the following syntax:
int1 = day(date input);
The input value specifies the date.
Returns
Returns an integer representation of the day in the specified input. For example:
select day('1996-2-29');
days_between
Determine the truncated number of full days between two timestamps.
Description
The days_between function has the following syntax:
int = days_between(timestamp t1, timestamp t2);
The t1 value specifies the beginning timestamp. The t2 value specifies the ending timestamp.
Returns
Returns the truncated number of full days between t1 and t2. For example:
select days_between('1996-02-27 06:12:33' , '1996-03-01 07:12:33');
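The phrase "truncated number of full days" can be made concrete with a short Python sketch. It assumes that truncation means discarding any partial unit and that t2 is not earlier than t1 (the behavior for a negative interval is not described here); the full_units_between helper is hypothetical. The same helper models the other *_between functions in this chapter by changing the unit length.

```python
from datetime import datetime

DAY = 24 * 60 * 60   # seconds per day
HOUR = 60 * 60       # seconds per hour


def full_units_between(t1: str, t2: str, unit_seconds: int) -> int:
    """Count whole units of unit_seconds between two timestamps (t2 >= t1)."""
    fmt = "%Y-%m-%d %H:%M:%S"
    delta = datetime.strptime(t2, fmt) - datetime.strptime(t1, fmt)
    # Integer division discards the partial unit, which is the truncation.
    return int(delta.total_seconds()) // unit_seconds


# 1996 is a leap year, so the interval is 3 days and 1 hour.
print(full_units_between("1996-02-27 06:12:33", "1996-03-01 07:12:33", DAY))   # prints 3
print(full_units_between("1996-02-27 06:12:33", "1996-03-01 07:12:33", HOUR))  # prints 73
```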
hour
Determine the hours value in the specified time. Note: This can also be accomplished using the Netezza date_part function.
Description
The hour function has the following syntax:
int1 = hour(time input);
The input value specifies the time.
Returns
Returns an integer representation of the hour in the specified time. For example:
select hour ('01:12:55');
hours_between
Determine the truncated number of full hours between two timestamps.
Description
The hours_between function has the following syntax:
int = hours_between(timestamp t1, timestamp t2);
Returns
Returns the truncated number of full hours between t1 and t2. For example:
select hours_between('1996-02-27 06:12:33' , '1996-03-01 07:12:33');
minute
Determine the minutes value in the specified time. Note: This can also be accomplished using the Netezza date_part function.
Description
The minute function has the following syntax:
int1 = minute(time input);
Returns
Returns an integer representation of the minute in the specified time. For example:
select minute ('01:12:55');
minutes_between
Determine the truncated number of full minutes between two timestamps.
Description
The minutes_between function has the following syntax:
int = minutes_between(timestamp t1, timestamp t2);
Returns
Returns the truncated number of full minutes between t1 and t2. For example:
select minutes_between('1996-02-27 06:12:33' , '1996-02-27 07:12:00');
month
Determine the month in the specified date. Note: This can also be accomplished using the Netezza date_part function.
Description
The month function has the following syntax:
int1 = month(date input);
Returns
Returns an integer representation of the month in the specified input. For example:
select month('1996-2-29');
next_month
Determine the first day of the next month after the specified date.
Description
The next_month function has the following syntax:
date = next_month(date input);
The input value specifies a date.
Returns
Returns a date value representing the first day of the next month after the month specified by the input. For example:
select next_month('1996-2-29');
This example returns 1996-03-01.
next_quarter
Determine the first day of the next quarter after the quarter specified by the input.
Description
The next_quarter function has the following syntax:
date = next_quarter(date input);
Returns
Returns a date value representing the first day of the next quarter after the quarter specified by the input. For example:
select next_quarter('1996-2-29');
next_year
Determine the first day of the next year after the year specified by the input.
Description
The next_year function has the following syntax:
date = next_year(date input);
Returns
Returns a date value representing the first day of the next year after the year specified by the input. For example:
select next_year('1996-2-29');
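The three next_* functions share one idea: find the start of the current period, then step to the start of the following one. The Python sketch below illustrates that logic under the assumption that quarters are calendar quarters beginning in January, April, July, and October; the function bodies are illustrative and are not the toolkit's implementation.

```python
from datetime import date


def next_month(d: date) -> date:
    """First day of the month after the month containing d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)


def next_quarter(d: date) -> date:
    """First day of the calendar quarter after the quarter containing d."""
    first_month = 3 * ((d.month - 1) // 3) + 1  # 1, 4, 7, or 10
    if first_month == 10:
        return date(d.year + 1, 1, 1)
    return date(d.year, first_month + 3, 1)


def next_year(d: date) -> date:
    """First day of the year after the year containing d."""
    return date(d.year + 1, 1, 1)


d = date(1996, 2, 29)
print(next_month(d))    # prints 1996-03-01
print(next_quarter(d))  # prints 1996-04-01
print(next_year(d))     # prints 1997-01-01
```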
second
Determine the seconds value in the specified time. Note: This can also be accomplished using the Netezza date_part function.
Description
The second function has the following syntax:
int1 = second(time input);
Returns
Returns an integer representation of the seconds value in the specified time. For example:
select second ('01:12:55');
seconds_between
Determine the truncated number of full seconds between two timestamps.
Description
The seconds_between function has the following syntax:
int = seconds_between(timestamp t1, timestamp t2);
Returns
Returns the truncated number of full seconds between t1 and t2. For example:
select seconds_between('1996-02-27 06:12:33','1996-02-27 06:55:22');
this_month
Determine the first day of the month in the specified date. Note: This functionality is also provided by the Netezza date_trunc() function.
Description
The this_month function has the following syntax:
date = this_month(date input);
Returns
Returns a date representing the first day of the month specified by input. For example:
select this_month('1996-2-29');
this_quarter
Determine the first day of the quarter in which the specified date occurs.
Description
The this_quarter function has the following syntax:
date = this_quarter(date input);
Returns
Returns a date value representing the first day of the specified quarter. For example:
select this_quarter('1996-2-29');
this_week
Determine the first day of the week in the specified date.
Description
The this_week function has the following syntax:
date = this_week(date input);
Returns
Returns a date value representing the first day of the week specified by input. For example:
select this_week('1996-2-29');
this_year
Determine the first day of the year in the specified date. Note: This functionality is also provided by the Netezza date_trunc() function.
Description
The this_year function has the following syntax:
date = this_year(date input);
Returns
Returns a date value representing the first day of the year specified by input. For example:
select this_year('1996-2-29');
weeks_between
Determine the truncated number of full weeks between two timestamps.
Description
The weeks_between function has the following syntax:
int = weeks_between(timestamp t1, timestamp t2);
Returns
Returns the truncated number of full weeks between t1 and t2. For example:
select weeks_between('1996-02-27 06:12:33' , '1996-03-05 07:12:33');
year
Determine the year in the specified date. Note: This can also be accomplished using the Netezza date_part function.
Description
The year function has the following syntax:
int2 = year(date input);
Returns
Returns an integer representation of the year in the specified date. For example:
select year('1996-2-29');
CHAPTER 6
Text Analytics
Word Comparison Function Reference Regular Expression Function Reference
The functions in this chapter fall into two distinct groupings. The word comparison functions are useful for fuzzy comparisons, finding records in a database that approximately match a search key, phonetically or lexically. The regular expression functions identify precise patterns of characters and are useful for data validation, for example type checks, range checks, and checks for illegal characters.
Word Comparison Function Reference

The functions are listed alphabetically. For those functions that operate only on ASCII characters, you can transliterate the strings to convert any accented characters to their ASCII unaccented versions. For those functions that consider case when evaluating strings, if you want to ignore case, you can use Netezza functions such as upper() and lower() to change the letter casing of strings prior to the comparison. For information on these functions, refer to the Netezza Performance Server Database Users Guide.
word_diff
Finds the number of modifications that are required to change the first string into the second string. Adding, deleting, substituting, or changing the case of a single character in the string each count as one modification. Transposing two adjacent characters counts as two modifications in all but the Damerau-Levenshtein algorithm, which counts transposition as a single modification. Note: Using the word_diff function with the Soundex or Double-Metaphone algorithms achieves the same result as using the combination of the word_key function to convert the strings to their phonetic encodings and then using the word_keys_diff function to compare those encodings. The word_diff function both converts the strings to their phonetic encodings and compares those encodings.
Description
The word_diff function has the following syntax:
int1 = word_diff(varchar word1, varchar word2 [, int algorithm]);
The word1 value specifies the first word in the comparison. The word2 value specifies the second word in the comparison. Algorithm is one of the following:
0 Soundex-Miracode
1 Soundex-Simplified
2 Soundex-SQLServer
3 Double-Metaphone (default if no algorithm given)
10 Levenshtein
11 Damerau-Levenshtein
Note: The built-in Netezza le_dst() function is equivalent to using the word_diff() function with the Levenshtein algorithm. The built-in Netezza dle_dst() function is equivalent to using the word_diff() function with the Damerau-Levenshtein algorithm.
Returns
Returns an integer that indicates how similar or different the two strings are. A value of 0 indicates the strings are the same. The results vary depending on the algorithm chosen. For example:
select word_diff('anderson','andrsn',0);
This example returns 0, because the Soundex algorithms consider only the initial vowel, not subsequent vowels. Suppose the algorithm is changed to Damerau-Levenshtein, as in the following example:
select word_diff('anderson','andrsn',11);
This call returns 2, because Damerau-Levenshtein accounts for the missing vowels e and o in the second string.
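The difference between the two edit-distance algorithms can be seen in a small Python sketch. It implements the standard dynamic-programming Levenshtein distance and, when transpositions is true, the optimal string alignment variant of Damerau-Levenshtein, in which swapping two adjacent characters costs one modification. The edit_distance name and its transpositions flag are hypothetical, and the sketch does not claim to match the toolkit's implementation in every edge case.

```python
def edit_distance(a: str, b: str, transpositions: bool = False) -> int:
    """Edit distance between a and b; adjacent swaps cost 1 if transpositions."""
    rows, cols = len(a) + 1, len(b) + 1
    # d[i][j] holds the distance between the first i chars of a and j of b.
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters
    for j in range(cols):
        d[0][j] = j  # insert all j characters
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[-1][-1]


print(edit_distance("anderson", "andrsn", transpositions=True))  # prints 2
print(edit_distance("abcd", "abdc"))                             # prints 2
print(edit_distance("abcd", "abdc", transpositions=True))        # prints 1
```

Because the comparison is character by character, a change of letter case counts as a substitution, which is consistent with the description of word_diff above.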
word_find
Searches the input varchar text for the first word that matches the input parameter word within the specified tolerance.
Description
The word_find function has the following syntax:
int4 = word_find(varchar word, varchar text, int1 difference [, int algorithm1 [, int algorithm2 [, int algorithm3]]]);
The word value specifies the word you want to search for in text. The text value specifies the varchar text to search. The difference value specifies the tolerance used by each specified algorithm when searching for a match. Each specified algorithm will be used to try to find a match within the tolerance defined by difference. If no algorithms are specified, or if the only algorithm specified is a stemming algorithm, then an exact (case-insensitive) match is required. Each algorithm value is one of the following:
0 Soundex-Miracode
1 Soundex-Simplified
2 Soundex-SQLServer
3 Double-Metaphone
10 Levenshtein
11 Damerau-Levenshtein
100 Porter
Returns
Returns the position of the first character of the matching string. For example:
select word_find('swimming', 'she swims in the competition in red wsimwear', 0, 11, 100);

select word_find('swimming', 'she swims in the competition in red wsimwear', 1, 11);

select word_find('SwimweaR ', 'she swims in the competition in red wsimwear', 0, 11);
word_key
Phonetically encode a word, according to its pronunciation in English, using the Double Metaphone algorithm or one of the three supported varieties of the Soundex algorithm. The phonetically encoded words can subsequently be compared with the word_keys_diff function for a fuzzy comparison. Words with the same pronunciation but different spellings are encoded the same; depending on the algorithm selected, similar sounding words might also be encoded the same. The goal is to enable you to match names based on their pronunciation and reduce misses that might result from spelling variations. For example, this type of fuzzy comparison can be used to find duplicate records resulting from spelling errors; another use is to find ancestor names in a genealogical database when the spelling has changed slightly over time. The phonetic matching functions are case-insensitive comparisons: the phonetic representations are the same for two strings that have the same spelling but different letter casing. The functions ignore any characters outside the ASCII subset.
Description
The word_key function has the following syntax:
int4 = word_key(varchar word [, int algorithm]);
The word value specifies the varchar word to be given a phonetic encoding.
The algorithm value is specified by an integer code (defaults to 3). The available algorithms are listed in the following table:
Table 6-1: Algorithms Supported for Phonetic Encoding
Code  Name                Description
0     Soundex-Miracode    The original Soundex algorithm used to encode surnames in the United States census between 1880 and 1930. All surnames are encoded as a four-character string: the first character represents the first letter of the person's last name, and characters two, three, and four are integer encodings for the remaining consonants in the name, ignoring vowels, collapsing duplicate encodings to a single value, and right-padding with zeroes if necessary.
1     Soundex-Simplified  An updated form of the original Soundex algorithm, it is identical to Miracode except that it does not encode H or W.
2     Soundex-SQLServer   The version of the Soundex algorithm implemented in Microsoft SQL Server. It does not apply the H or W rule, and similarity grouping starts after the first character.
3     Double-Metaphone    Encodes most English words, not just names. The algorithm better quantifies the rules of English pronunciation and also recognizes a subset of non-Latin characters, making it a much better choice than Soundex (it is the algorithm used by most spell checkers). Whereas Soundex encodes all names with a key of the same length, Double-Metaphone outputs variable length encodings that more accurately represent the sounds of the word. The algorithm also handles the case in which a word has an alternate pronunciation by returning a primary and a secondary encoding.
Note: The Netezza built-in dbl_mp() function is equivalent to using the word_key() function with the Double Metaphone algorithm. The Netezza built-in nysiis() function is roughly equivalent to using the word_key() function with the Soundex-Simplified algorithm.
Returns
The function returns the word_key code of a word as an integer. These codes can be compared using the word_keys_diff() function. For example:
select word_key('persistent',1);
This example returns the encoding 67106.
word_key_tochar
Returns the varchar representation of the phonetic encoding produced by the word_key function.
Description
The word_key_tochar function has the following syntax:
varchar = word_key_tochar(int wordkey [, int algorithm]);
The wordkey value specifies the word_key encoding to be given a varchar representation. Algorithm is one of the following:
0 Soundex-Miracode
1 Soundex-Simplified
2 Soundex-SQLServer
3 Double-Metaphone (default if no algorithm given)
Returns
The function returns the varchar representation of the specified word_key encoding. For example, word_key_tochar(word_key('Ashcroft', 0), 0) returns A261. Similarly:
select word_key_tochar(word_key('PERsisteNT',2),2);
This example returns P622.
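The four-character Soundex codes quoted in this chapter can be reproduced with a short Python sketch of the commonly published Soundex rules. It is an illustration only: the soundex and keys_diff functions and the hw_rule flag are hypothetical names, the flag models the rule that H and W do not separate two letters with the same code, and which toolkit algorithm code corresponds to which setting is not asserted here. The toolkit itself stores the encoding as an integer; this sketch produces only the character form.

```python
# Soundex digit for each encoded consonant; vowels, H, W, and Y have none.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word: str, hw_rule: bool = True) -> str:
    """Return the four-character Soundex code of word (ASCII letters only)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    digits = []
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if hw_rule and ch in "HW":
            continue  # H and W are skipped and do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (letters[0] + "".join(digits) + "000")[:4]


def keys_diff(key1: str, key2: str) -> int:
    """Count the character positions at which two codes differ."""
    return sum(a != b for a, b in zip(key1, key2))


print(soundex("Ashcroft"))                               # prints A261
print(soundex("PERsisteNT"))                             # prints P622
print(keys_diff(soundex("Johnson"), soundex("Jeppeson")))  # prints 1
```

Upper-casing the input before encoding is what makes the comparison case-insensitive, as described for the phonetic matching functions.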
word_keys_diff
Computes the lexical difference between phonetic encodings produced by the word_key function. Note: Soundex word keys can be compared for an exact match by comparing the int4 keys directly without using this function.
Description
The word_keys_diff function has the following syntax:
int1 = word_keys_diff(int4 wordkey1, int4 wordkey2 [, int algorithm]);
The wordkey1 value specifies the first word_key encoding in the comparison. The wordkey2 value specifies the second word_key encoding in the comparison. Algorithm is one of the following:
0 Soundex-Miracode
1 Soundex-Simplified
2 Soundex-SQLServer
3 Double-Metaphone (default if no algorithm given)
Returns
Soundex will return a value between 0 and 4. 0 represents an exact match. 1-4 represent increasing degrees of inexactness. For example:
select word_keys_diff(word_key('Johnson',0),word_key('Jeppeson',0),0);
This example returns 1 because the two soundex encodings differ by 1 character; the soundex code for Johnson is J525 and the soundex code for Jeppeson is J125.
word_stem
Returns the root stem of the given varchar word. (e.g. fishing, fished, fisher all return fish).
Description
The word_stem function has the following syntax:
varchar = word_stem(varchar word [, int algorithm]);
The word value specifies the varchar word whose root stem you want. The algorithm value has just one option, 100, which indicates the Porter algorithm. This is the default, so no algorithm need be specified.
Returns
The function returns the root stem of the given varchar word. For example:
select word_stem('fishing');
select word_stem('fisher');
Both of these examples return fish.
Regular Expression Function Reference

The supported regular expression functions are fully Perl v5 compatible. A discussion of how regular expressions operate is beyond the scope of this document. For more information, refer to the many texts available that discuss how to construct Perl regular expressions.
The Flags Argument

The functions described in this section all take a flags argument. The flags argument can contain any of the following:
Table 6-2: Flags Used in Regular Expression Functions
Flag  Short Description              Full Description
m     Multi-line                     Specifies that the input data may contain more than one line so that the ^ and the $ matches should take that into account. Equivalent to the Perl /m option.
i     Case insensitive               Matching should take place without considering case. Equivalent to the Perl /i option.
c     Case sensitive                 The default and opposite of the i parameter.
s     Dot All                        Specifies that the . character should match newlines. Equivalent to the Perl /s option.
n     Equivalent to the s parameter  Included for compatibility with vendors that use the n flag.
x     Extended                       White space data characters are ignored unless escaped. Equivalent to the Perl /x option.
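The effect of each flag can be explored with Python's re module, whose MULTILINE, IGNORECASE, DOTALL, and VERBOSE options follow the same Perl conventions. The sketch below is an analogy rather than the toolkit's engine: the FLAG_MAP table, the to_re_flags and matches helpers, and the sample text are hypothetical, and treating c as "no flag" is an assumption about how the default behaves.

```python
import re

# Toolkit flag letter -> closest Python re option (c is the default, so 0).
FLAG_MAP = {"m": re.M, "i": re.I, "c": 0, "s": re.S, "n": re.S, "x": re.X}


def to_re_flags(flags: str) -> int:
    """Combine a string of flag letters into a Python re flags value."""
    combined = 0
    for letter in flags:
        combined |= FLAG_MAP[letter]
    return combined


def matches(pattern: str, text: str, flags: str = "") -> bool:
    """Return True if pattern is found anywhere in text under the flags."""
    return re.search(pattern, text, to_re_flags(flags)) is not None


text = "first line\nSecond line"
print(matches(r"^Second", text), matches(r"^Second", text, "m"))
print(matches(r"second", text), matches(r"second", text, "i"))
print(matches(r"line.Second", text), matches(r"line.Second", text, "s"))
print(matches(r"first \s line", text), matches(r"first \s line", text, "x"))
```

Each line prints False followed by True: the pattern fails with the default settings and succeeds once the corresponding flag is supplied.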
regexp_extract
Pulls out the matching text item. Note: Analogous to the REGEXP_SUBSTR() function provided by some vendors.
Description
The regexp_extract function has the following syntax:
varchar = regexp_extract(varchar input, varchar pattern [, int start_pos [, int reference]] [, varchar flags]);
The input value specifies the varchar on which the regular expression is processed. The pattern value specifies the regular expression. The start_pos value specifies the character position at which to start the search (defaults to position 1). The reference value specifies which instance of the pattern to extract (defaults to 1). For a description of flags, see The Flags Argument on page 6-6.
Returns
For example:
select regexp_extract('hello to you', '.o',1,1);
select regexp_extract('hello to you', '.o',1,2);
select regexp_extract('hello to you', '.o',1,3);
The first example returns lo, the second returns to, and the third returns yo.
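The roles of start_pos and reference can be sketched with Python's re module. The nth_match helper below is hypothetical; it assumes that matches are counted left to right without overlap and that positions are 1-based, as in the toolkit signatures. It returns both the matched text and its starting position, the latter being the kind of value regexp_instr reports.

```python
import re


def nth_match(text: str, pattern: str, start_pos: int = 1, reference: int = 1):
    """Return (matched_text, 1-based position) of the reference-th match."""
    compiled = re.compile(pattern)
    # start_pos is 1-based; Python string offsets are 0-based.
    found = compiled.finditer(text, start_pos - 1)
    for count, match in enumerate(found, start=1):
        if count == reference:
            return match.group(0), match.start() + 1
    return None  # fewer than reference matches


for reference in (1, 2, 3):
    print(nth_match("hello to you", ".o", 1, reference))
# prints ('lo', 4), then ('to', 7), then ('yo', 10)
```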
regexp_extract_all
Pulls out all the matching text items and returns them in a varchar array.
Description
The regexp_extract_all function has the following syntax:
array(varchar) = regexp_extract_all(varchar input, varchar pattern [, int start_pos] [, varchar flags]);
The input value specifies the varchar on which the regular expression is processed. The pattern value specifies the regular expression. The start_pos value specifies the character position at which to start the extract (defaults to position 1). For a description of flags, see The Flags Argument on page 6-6.
Returns
For example:
select array_combine(regexp_extract_all('Steven .Stephen are best player','Ste(v|ph)en'),'|');
This example returns

Steven|Stephen
regexp_extract_all_sp
Processes the specified regular expression on the varchar input. All sub-patterns are returned in an array with the first element (element 0) corresponding to the full match.
Description
The regexp_extract_all_sp function has the following syntax:
array(varchar) = regexp_extract_all_sp(varchar input, varchar pattern [, int start_pos][, varchar flags]);
The input value specifies the varchar on which the regular expression is processed. The pattern value specifies the regular expression. The start_pos value specifies the character position at which to start the extract (defaults to position 1). For a description of flags, see The Flags Argument on page 6-6.
Returns
For example:
select array_combine(regexp_extract_all_sp('Robert Szissel, 128 Folson St, Boston', '([^,]*),[[:space:][:digit:]]*([^[:space:]]*).*,[[:space:]]*(.*)'),'|' );
This example returns Robert Szissel, 128 Folson St, Boston|Robert Szissel|Folson|Boston
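The relationship between the full match (element 0) and the sub-patterns can be reproduced with Python's re module. Python does not support POSIX bracket classes such as [:space:], so this sketch rewrites them as \s and \d, which is an assumption of equivalence for ASCII input; the all_subpatterns helper is hypothetical.

```python
import re

text = "Robert Szissel, 128 Folson St, Boston"
# The example pattern with POSIX classes rewritten for Python:
# [[:space:][:digit:]] -> [\s\d], [^[:space:]] -> [^\s], [[:space:]] -> \s
pattern = r"([^,]*),[\s\d]*([^\s]*).*,\s*(.*)"


def all_subpatterns(text: str, pattern: str) -> list:
    """Return [full match, group 1, group 2, ...] or [] if nothing matches."""
    match = re.search(pattern, text)
    if match is None:
        return []
    # Unmatched optional groups come back as None; show them as empty strings.
    return [match.group(0)] + [group or "" for group in match.groups()]


print("|".join(all_subpatterns(text, pattern)))
# prints Robert Szissel, 128 Folson St, Boston|Robert Szissel|Folson|Boston
```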
regexp_extract_sp
Processes the specified regular expression on the varchar input, returning the specified sub-pattern.
Description
The regexp_extract_sp function has the following syntax:
varchar = regexp_extract_sp(varchar input, varchar pattern , int start_pos , int reference[, varchar flags]);
The input value specifies the varchar on which the regular expression is processed. The pattern value specifies the regular expression. The start_pos value specifies the character position at which to start the extract (defaults to position 1). The reference value specifies which instance of the pattern to extract. For a description of flags, see The Flags Argument on page 6-6.
Returns
For example, consider the following table:
create table sample(col1 varchar(20));
CREATE TABLE
insert into sample values('bcaaabc');
INSERT 0 1
insert into sample values('abcbc');
INSERT 0 1
insert into sample values('bbb');
INSERT 0 1
insert into sample values('bcd');
INSERT 0 1
insert into sample values('bccdebc');
INSERT 0 1
insert into sample values('def');
INSERT 0 1
insert into sample values('efgbcbc');
INSERT 0 1
And consider the following query executed against this table:

select regexp_extract_sp ( col1, '[acf]' ,1,1)from sample order by rowid;
This example returns 7 rows. The third row is empty because bbb contains none of the characters a, c, or f:
c
a

c
c
f
f
regexp_instr
Pulls out the index of the matching text item.
Description
The regexp_instr function has the following syntax:
int = regexp_instr(varchar input, varchar pattern [, int start_pos [, int reference]] [, varchar flags]);
The input value specifies the varchar on which the regular expression is processed. The pattern value specifies the regular expression. The start_pos value specifies the character position at which to start the search for a match (defaults to position 1). The reference value indicates a specific instance of the pattern. For a description of flags, see The Flags Argument on page 6-6.
Returns
If there is no match, or if there are fewer than reference occurrences of the pattern, the function returns 0. For example:
select regexp_instr('hello to you', '.o', 1, 1);
select regexp_instr('hello to you', '.o', 1, 2);
select regexp_instr('hello to you', '.o', 1, 3);
The first example returns 4, the second returns 7, and the third returns 10.
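The returned positions are 1-based and count from the start of the input. The following Python sketch models that rule as an illustration only: it assumes that Python's re module matches this simple pattern the same way the toolkit's Perl 5 style expressions do, and the helper name regexp_instr_sketch is hypothetical.

```python
import re


def regexp_instr_sketch(text, pattern, start_pos=1, reference=1):
    """Return the 1-based position of the reference-th match, or 0."""
    # start_pos is 1-based, so convert it to Python's 0-based offset.
    matches = re.compile(pattern).finditer(text, start_pos - 1)
    for count, match in enumerate(matches, start=1):
        if count == reference:
            # match.start() is relative to the whole string, not to start_pos.
            return match.start() + 1
    # Fewer than `reference` matches were found.
    return 0


for reference in (1, 2, 3, 4):
    print(reference, regexp_instr_sketch("hello to you", ".o", 1, reference))
```

The fourth call asks for an occurrence that does not exist, so the sketch returns 0, mirroring the rule stated above.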
regexp_like
Returns true if there is at least one matching occurrence in input.
Description
The regexp_like function has the following syntax:
bool = regexp_like(varchar input, varchar pattern [, int start_pos] [, varchar flags]);
The input value specifies the varchar on which the regular expression is processed. The pattern value specifies the regular expression. The start_pos value specifies the character position at which to start the search for a match (defaults to position 1). For a description of flags, see The Flags Argument on page 6-6.
Returns
For example:
select regexp_like('my password is 09124 or 069az6','[0-9][^0-9]+[0-9]$');
This example returns true.
regexp_match_count
Returns the number of matching occurrences in input.
Description
The regexp_match_count function has the following syntax:
int = regexp_match_count(varchar input, varchar pattern [, int start_pos] [, varchar flags]);
The input value specifies the varchar on which the regular expression is processed. The pattern value specifies the regular expression. The start_pos value specifies the character position at which to start the search for a match (defaults to position 1). For a description of flags, see The Flags Argument on page 6-6.
Returns
For example:
select regexp_match_count('Steven Jones and Stephen Smith are the best players','Ste(v|ph)en');
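As a rough cross-check of this query, the following Python sketch counts non-overlapping matches. It is an illustration only: it assumes that Python's re module handles the alternation (v|ph) the same way the toolkit does, and the helper name regexp_match_count_sketch is hypothetical.

```python
import re


def regexp_match_count_sketch(text, pattern, start_pos=1):
    """Count non-overlapping matches, starting at a 1-based position."""
    matches = re.compile(pattern).finditer(text, start_pos - 1)
    return sum(1 for _ in matches)


sentence = "Steven Jones and Stephen Smith are the best players"
count = regexp_match_count_sketch(sentence, "Ste(v|ph)en")
print(count)
```

Under these assumptions the count is 2, one match for Steven and one for Stephen.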
regexp_replace
Replaces each instance of pattern in input with the value in the varchar replacement.
Description
The regexp_replace function has the following syntax:
varchar = regexp_replace(varchar input, varchar pattern, varchar replacement [, int start_pos [, int reference]] [, varchar flags]);
The input value specifies the varchar on which the regular expression is processed. The pattern value specifies the regular expression. The replacement value specifies the value to substitute for each instance of pattern. The start_pos value specifies the character position at which to start the replace (defaults to position 1). The reference value specifies which instance of the pattern to replace. For a description of flags, see The Flags Argument on page 6-6.
Returns
If reference is set to 0 (or not specified), all occurrences of the pattern are replaced. For example:
select regexp_replace('Awake! Fear, Fire, Foes!','Foes','Flee');
This example returns:
Awake! Fear, Fire, Flee!
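The difference between replacing every occurrence (reference 0) and replacing only one occurrence can be sketched in Python. This is an illustration of the rule described above, not the toolkit's implementation: the helper name regexp_replace_sketch is hypothetical, and slicing the input at start_pos is a simplifying assumption that changes how anchors such as ^ behave.

```python
import re


def regexp_replace_sketch(text, pattern, replacement, start_pos=1, reference=0):
    """Replace all matches (reference 0) or only the reference-th match."""
    offset = start_pos - 1          # convert the 1-based position
    head, tail = text[:offset], text[offset:]
    seen = 0

    def swap(match):
        nonlocal seen
        seen += 1
        if reference == 0 or seen == reference:
            return match.expand(replacement)
        return match.group(0)       # leave other occurrences untouched

    return head + re.sub(pattern, swap, tail)


print(regexp_replace_sketch("Awake! Fear, Fire, Foes!", "Foes", "Flee"))
print(regexp_replace_sketch("a-b-c", "-", "+", 1, 2))
```

The first call reproduces the documented result; the second replaces only the second hyphen.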
regexp_replace_sp
Processes the specified regular expression on the varchar input and replaces each instance of a sub-pattern with the values in the array replacements.
Description
The regexp_replace_sp function has the following syntax:
varchar = regexp_replace_sp(varchar input, varchar pattern, array replacements [, int start_pos] [, varchar flags]);
The input value specifies the varchar on which the regular expression is processed. The pattern value specifies the regular expression. The replacement array specifies the values to substitute for each instance of the subpattern. The start_pos value specifies the character position at which to start the replace (defaults to position 1). For a description of flags, see The Flags Argument on page 6-6.
Returns
For example:
select regexp_replace_sp('Robert Szissel, 128 Folson St, Boston', '([[:digit:]]+)[^.]*,.*(Boston)', array_split('37000,Cleveland', ','));
This example returns:
Robert Szissel, 37000 Folson St, Cleveland
CHAPTER 7
Text Utility
Text Utility Function Reference
The text utility functions in this chapter enable you to convert between ASCII hexadecimal and ASCII, substitute substrings, and extract substrings.

Functions are listed alphabetically.
hextoraw
Interprets each pair of characters (left to right) in the input varchar as the hexadecimal code for an ASCII character and converts the hexadecimal sequence into a character string.
Description
The hextoraw function has the following syntax:
varchar = hextoraw(varchar input);
The input value specifies the varchar to convert.
Returns
For example:
SELECT hextoraw('68656C6C6f');
This example returns the varchar: hello
rawtohex
Converts a character string into the ASCII hexadecimal representation.
Description
The rawtohex function has the following syntax:
varchar = rawtohex(varchar input);
The input value specifies the varchar to convert.
Returns
For example:
SELECT rawtohex('hello');
This example returns the varchar: 68656C6C6F
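The two conversions are inverses of each other. The following Python sketch shows the pairing of hexadecimal digits with ASCII characters; it is an illustration only, the helper names hextoraw_sketch and rawtohex_sketch are hypothetical, and the uppercase output is an assumption taken from the example result above.

```python
def hextoraw_sketch(hex_text):
    """Read each pair of hex digits, left to right, as one ASCII character."""
    return bytes.fromhex(hex_text).decode("ascii")


def rawtohex_sketch(text):
    """Write each ASCII character as two uppercase hex digits."""
    return text.encode("ascii").hex().upper()


print(hextoraw_sketch("68656C6C6f"))
print(rawtohex_sketch("hello"))
```

Hexadecimal digits are accepted in either case on input, which is why 6f and 6F both decode to the letter o.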
replace
Replaces each instance of pattern in input with the value in the varchar replacement.
Description
The replace function has the following syntax:
varchar = replace(varchar input, varchar pattern, varchar replacement);
The input value specifies the varchar in which the characters are replaced. The pattern value specifies the characters to replace. The replacement value specifies the characters to substitute for each instance of pattern.
Returns
For example:
select replace('persisaent','a','t');
This example returns:
"persistent"
strleft
Returns the left-most n characters from the varchar input.
Description
The strleft function has the following syntax:
varchar = strleft(varchar input, int n);
The input value specifies the varchar from which the characters are returned. The n value specifies the number of characters to return.
Returns
For example:
select strleft('1234567891',5);
This example returns:
"12345"
strright
Returns the right-most n characters from the varchar input.
Description
The strright function has the following syntax:
varchar = strright(varchar input, int n);
The input value specifies the varchar from which the characters are returned. The n value specifies the number of characters to return.
Returns
For example:
select strright('1234567891',5);
This example returns:
"67891"
CHAPTER 8
Array
Array Function Reference
The array functions in the Netezza SQL Extensions Toolkit rely on the array data type. Because the Netezza database currently does not support user-defined types, the array type is stored in a varchar field. The maximum size of a varchar field is 64000 bytes. The array type consists of a sequence of name-value pairs. Names can be a maximum of 40 characters in width. Values can be any built-in SQL type, but must be the same type for the entire array. Elements can be referenced either by name or by 1-based index.

add_element
Appends a new array element to the end of the input array and assigns it the specified value. This is an overloaded function, with 7 forms corresponding to the 7 data types.
Description
The syntax of the add_element function has seven forms, one for each data type:
array = add_element(array input, varchar value [, varchar name])
array = add_element(array input, nvarchar value [, varchar name])
array = add_element(array input, int8 value [, varchar name])
array = add_element(array input, double value [, varchar name])
array = add_element(array input, time value [, varchar name])
array = add_element(array input, date value [, varchar name])
array = add_element(array input, timestamp value [, varchar name]);
The input value specifies the array to which the element is appended. The value value specifies the value to store in the new array element. The optional name value specifies the name of the array element being appended.
Returns
For example:
add_element(my_array, 45)
Assuming my_array has four elements, this example appends a fifth element to the end of the array and stores the value 45 in that element.
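The array model described in this chapter (a sequence of name-value pairs, one value type per array, lookup by name or by 1-based index) can be sketched in Python. This is a conceptual model only and says nothing about how the toolkit encodes arrays inside a varchar; the helper names add_element_sketch and get_value_sketch and the element name 'fifth' are hypothetical.

```python
MAX_NAME_WIDTH = 40  # names can be at most 40 characters wide


def add_element_sketch(array, value, name=None):
    """Return a new list with the (name, value) pair appended to the end."""
    if array and type(value) is not type(array[0][1]):
        raise TypeError("all values in an array must share one type")
    if name is not None and len(name) > MAX_NAME_WIDTH:
        raise ValueError("element name is longer than 40 characters")
    return array + [(name, value)]


def get_value_sketch(array, key):
    """Look an element up by 1-based index (int) or by name (str)."""
    if isinstance(key, int):
        if not 1 <= key <= len(array):
            raise IndexError("no element at index %d" % key)
        return array[key - 1][1]
    for name, value in array:
        if name == key:
            return value
    raise KeyError(key)


# An array that already holds four unnamed integer elements.
my_array = [(None, 10), (None, 20), (None, 30), (None, 40)]
my_array = add_element_sketch(my_array, 45, "fifth")
print(len(my_array), get_value_sketch(my_array, 5), get_value_sketch(my_array, "fifth"))
```

The sketch returns a new list rather than changing its input, which matches the way the toolkit functions return an array value instead of modifying a column in place.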
array
Creates an array of the given type.
Description
The array function has the following syntax:
array = array(int type);
The type value specifies the type of array to create. The type takes an integer code between 1 and 11 that indicates the type, as shown in the following table:
Table 8-1: Array Types
Code | Type | Size
1 | Int1 | 8 bit
2 | Int2 | 16 bit
3 | Int4 | 32 bit
4 | Int8 | 64 bit
5 | Date | Ranging from January 1, 0001, to December 31, 9999. Disk Usage: 4 bytes
6 | Time | Hours, minutes, and seconds to 6 decimal positions. Ranging from 00:00:00.000000 to 23:59:59.999999. Disk Usage: 8 bytes
7 | Timestamp | Has a date part and a time part, with seconds stored to 6 decimal positions. Ranging from January 1, 0001 00:00:00.000000 to December 31, 9999 23:59:59.999999. Disk Usage: 8 bytes
8 | Varchar | Variable length to a maximum length of n. No blank padding, stored as entered. The maximum character string size is 64,000. Uses N+2 or fewer bytes depending on the data.
9 | NvarChar | Variable-length Unicode data with a maximum length of 16000 characters. Using UTF-8 encoding, each Unicode code point requires 1-4 bytes of storage. So a 10-character string requires 10 bytes of storage if it is ASCII, up to 20 bytes if it is Latin, or as many as 40 bytes if it is pure Kanji (but typically 30 bytes).
10 | Float | Floating point number with precision 1 to 15. Precision less than 6 uses 4 bytes. Precision between 7 and 15 uses 8 bytes.
11 | Double | Equivalent to float with precision 15, using 8 bytes
Returns
For example:
create table array_t(col1 int,col2 varchar(100));
array_combine
Combines the array elements in the array input into a single varchar delimited by delimiter.
Description
The array_combine function has the following syntax:
varchar = array_combine(array input, char delimiter);
The input value specifies the array to decompose into a single varchar. The delimiter value specifies the delimiter that distinguishes the array elements.
Returns
For example:
select array_combine(col2,'|')from array_t;
A possible return value might be:

12|23
array_concat
Concatenates two arrays, creating a new array that contains all the elements in the first array followed by all the elements in the second array. Note: The two arrays must be of the same type and element names cannot be the same.
Description
The array_concat function has the following syntax:
array = array_concat(array array1, array array2);
The array1 value specifies the first of the two arrays to concatenate. The array2 value specifies the second of the two arrays to concatenate.
Returns
For example:
select (array_concat (array(2),array(2)));
array_count
Returns the number of elements in the array.
Description
The array_count function has the following syntax:
int = array_count(array input);
The input value specifies the array in which to count elements.
Returns
For example:
select array_count(col2)from array_t;
A possible return value might be:

2
array_split
Parses the input for elements separated by a delimiter to create an array.
Description
The array_split function has the following syntax:
array = array_split(varchar input, varchar delimiter [, int type]);
The input value specifies a character-delimited list of elements. The delimiter value specifies the delimiter used in the input. The optional type value specifies the type of the array; the type defaults to varchar.
Returns
For example:
select array_combine(array_split('1,2,3,4,5,6,7,8',','),'|');
This example returns:
1|2|3|4|5|6|7|8
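The round trip through array_split and array_combine amounts to splitting on one delimiter and joining on another. The following Python sketch models that behavior for varchar elements only; the helper names are hypothetical, and the sketch ignores element names and the optional type argument.

```python
def array_split_sketch(text, delimiter):
    """Split a delimited string into a list of string elements."""
    return text.split(delimiter)


def array_combine_sketch(elements, delimiter):
    """Join the elements into one string separated by delimiter."""
    return delimiter.join(str(element) for element in elements)


elements = array_split_sketch("1,2,3,4,5,6,7,8", ",")
print(len(elements))
print(array_combine_sketch(elements, "|"))
```

The length printed first corresponds to what array_count would report for the same array.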
array_type
Returns the type of the array.
Description
The array_type function has the following syntax:
int = array_type(array input);
The input value specifies the array for which to get the type.
Returns
For example:
select array_type(array(4));
This example returns 4. The second example determines the type of an array that is stored in a table:
select array_type(col2)from array_t;
delete_element
Deletes an element from the input array.
Description
The syntax for the delete_element function supports deleting by name or by index:
array = delete_element(array input, int index);
array = delete_element(array input, varchar name);
The input value specifies the array which contains the element to delete. The index value specifies the index of the element to delete from the input array. The name value specifies the name of the element to delete from the input array.
Returns
For example:
select delete_element(col2,1)from array_t;
element_name
Returns the name of an element if it exists.
Description
The element_name function has the following syntax:
varchar = element_name(array input, int index);
The input value specifies the array which contains the named element. The index value specifies the element for which to retrieve the name.
Returns
For example:
select element_name(add_element(array(4),4,'Netezza'),1);
This example returns:
Netezza
get_value_type
Retrieves the value stored in the specified array element. The name of the function is of the form get_value_type, where type is the data type of the element to retrieve, for example get_value_varchar. There are seven data types, but there are two versions of the function for each data type, enabling you to retrieve array elements by index or by name.
Description
The get_value_type function has the following syntax:
varchar = get_value_varchar(array input, int index);
varchar = get_value_varchar(array input, varchar name);
nvarchar = get_value_nvarchar(array input, int index);
nvarchar = get_value_nvarchar(array input, varchar name);
int8 = get_value_int(array input, int index);
int8 = get_value_int(array input, varchar name);
double = get_value_double(array input, int index);
double = get_value_double(array input, varchar name);
time = get_value_time(array input, int index);
time = get_value_time(array input, varchar name);
date = get_value_date(array input, int index);
date = get_value_date(array input, varchar name);
timestamp = get_value_timestamp(array input, int index);
timestamp = get_value_timestamp(array input, varchar name);
The input value specifies the array which contains the element to retrieve. The index value specifies the index of the element to retrieve from the input array. The name value specifies the name of the element to retrieve from the input array.
Returns
This function attempts to perform type conversion if the specified element is of a different type than the function returns. If the conversion fails, or if the element does not exist, the function returns an error. For example:
select get_value_int(col2,1)from array_t;
A possible return value is 12.
replace_element
Replaces an array element in the input array. This is an overloaded function, with 14 forms corresponding to the 7 data types (by name or by array index).
Description
The syntax of the replace_element function has 14 variations, two for each of the 7 data types (one for referencing an element by name and one for referencing an element by index):
array = replace_element(array input, int index, varchar value)
array = replace_element(array input, varchar name, varchar value)
array = replace_element(array input, int index, nvarchar value)
array = replace_element(array input, varchar name, nvarchar value)
array = replace_element(array input, int index, int8 value)
array = replace_element(array input, varchar name, int8 value)
array = replace_element(array input, int index, double value)
array = replace_element(array input, varchar name, double value)
array = replace_element(array input, int index, time value)
array = replace_element(array input, varchar name, time value)
array = replace_element(array input, int index, date value)
array = replace_element(array input, varchar name, date value)
array = replace_element(array input, int index, timestamp value)
array = replace_element(array input, varchar name, timestamp value);
The input value specifies the array in which the element is replaced. The index value specifies the position in the array at which the element is replaced. The name value specifies the name of the array element to replace. The value value specifies the new value for the specified array element.
Returns
For example:
select replace_element(col2,1,15)from array_t;
CHAPTER 9
Collection
User Type Collection
Collection Function Reference
Collections are useful for grouping together heterogeneous information; in other words, information of different data types can be stored in each element in the collection, unlike arrays in which each element must be of the same data type.
User Type Collection

A new user type, collection, is defined in this section. The collection type consists of a sequence of name-value pairs. Names can be a maximum of 40 characters in width. Values can be any built-in SQL type. Elements can be referenced either by name or by 1-based index. Because the Netezza user-defined functions (UDFs) currently do not support new user types, the collection type is stored in a varchar field.
Collection Function Reference

In addition to the two functions listed in this section, you can use any of the array functions listed in Chapter 8, "Array," to retrieve and manipulate collection elements.
collection
Creates an empty collection.
Description
The collection function has the following syntax:
collection = collection();
Returns
For example:
create table collection_t(col1 int, col2 varchar(100));
element_type
Returns the type of the collection element.
Description
The element_type function has the following syntax:
int = element_type(collection input, int index); int = element_type(collection input, varchar name);
The input value specifies the collection. The index value specifies the index of the element whose type is returned. The name value specifies the name of the element whose type is returned.
Returns
For example:
select element_type(col2,1)from collection_t;
Assuming an element of type Int8 (type code 4 in Table 8-1), the example returns 4.
CHAPTER 10
Miscellaneous
Miscellaneous Function Reference
This chapter contains those functions that do not fit neatly into the functional groupings in the preceding chapters of this manual.

greatest
Returns the largest of the input values, up to a maximum of four (variable length lists are not supported).
Description
The syntax of the function has three forms, depending on the data type of the values being compared:
int4 = Greatest(int4 value1, int4 value2, ...);
int8 = Greatest(int8 value1, int8 value2, ...);
double = Greatest(double value1, double value2, ...);
The value1 value specifies the first input to compare. The value2 value specifies the second input to compare. The value3 value specifies the third input to compare. The value4 value specifies the fourth input to compare.
Returns
For example:
select greatest(12,45,85);
least
Returns the smallest of the input parameters, up to a maximum of four (variable length lists are not supported).
Description
The syntax of the function has three forms, depending on the data type of the values being compared:
int4 = Least(int4 value1, int4 value2, ...);
int8 = Least(int8 value1, int8 value2, ...);
double = Least(double value1, double value2, ...);
The value1 value specifies the first input to compare. The value2 value specifies the second input to compare. The value3 value specifies the third input to compare. The value4 value specifies the fourth input to compare.
Returns
For example:
select least(14,45,75);
mt_random
Returns a pseudo-random number between 0.0 and 1.0 using the Mersenne Twister pseudo-random number generator, an open source library that quickly generates high-quality pseudo-random numbers with a period of 2^19937 - 1 and very good distribution. The pseudo-random numbers are well suited to simulations, such as Monte Carlo simulations, as well as to polling, for example providing a random sample of 1000 records from a table of one million records. This algorithm by itself is not suitable for cryptography because observing as few as 624 iterations is enough to predict all future iterations. Wrapping this function with a hash function is likely sufficient to provide cryptographically secure random numbers.
Note: NPS offers a built-in random() function, which is based on the Linear Congruential Generator algorithm. The Mersenne Twister algorithm is often favored for certain randomness applications.
Description
The mt_random function has the following syntax:
mt_random = mt_random();
Returns
The function returns a pseudo-random number between 0.0 and 1.0. The following example pulls a well-distributed random sample of 10 records from the Customer_Table:
SELECT * FROM Customer_Table ORDER BY mt_random() LIMIT 10;
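The sampling idiom in this query is to attach a random key to every row, sort by that key, and keep the first rows. The following Python sketch shows the same idea as an illustration only. It relies on the fact that CPython's random module is itself built on the Mersenne Twister; the helper name random_sample_sketch, the seed argument, and the stand-in rows are hypothetical.

```python
import random


def random_sample_sketch(rows, limit, seed=None):
    """Sort rows by a random key and keep the first `limit` of them."""
    rng = random.Random(seed)  # Mersenne Twister generator in CPython
    keyed = [(rng.random(), row) for row in rows]   # like ORDER BY mt_random()
    keyed.sort(key=lambda pair: pair[0])
    return [row for _, row in keyed[:limit]]        # like LIMIT 10


customers = list(range(1000))  # stand-in for the rows of Customer_Table
sample = random_sample_sketch(customers, 10, seed=1)
print(sample)
```

Because every row receives its own key, each row appears at most once in the sample, so this is sampling without replacement.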
corr
This aggregate function returns the correlation coefficient of the set of number pairs inputa and inputb.
Description
The corr function has the following syntax:
double = corr(Set(double) inputa, Set(double) inputb);
The inputa value specifies the first number of each pair. The inputb value specifies the second number of each pair.
Returns
For example, assuming a table function_t with the values 1.2, 1.4, and 1.6 in col1 and the values 1.4, 1.6, and 1.8 in col2:
select corr(col1,col2)from function_t;
This example returns 1.
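The result follows from the data: each col2 value is the matching col1 value plus 0.2, so the two columns are perfectly linearly related. The following Python sketch works through the arithmetic using the standard Pearson formula. It is an illustration only; the helper name corr_sketch is hypothetical and the source does not state which formula the toolkit uses internally.

```python
import math


def corr_sketch(xs, ys):
    """Pearson correlation coefficient of the paired values xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sum((x - mean_x) ** 2 for x in xs)
    dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / math.sqrt(dev_x * dev_y)


col1 = [1.2, 1.4, 1.6]
col2 = [1.4, 1.6, 1.8]
coefficient = corr_sketch(col1, col2)
print(round(coefficient, 9))
```

Floating-point rounding can leave the raw value a hair away from 1, which is why the sketch rounds before printing.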
covar_pop
This aggregate function returns the population-based covariance of the set of number pairs inputa and inputb.
Description
The covar_pop function has the following syntax:
double = covar_pop(Set(double) inputa, Set(double) inputb);
The inputa value specifies the first number of each pair. The inputb value specifies the second number of each pair.
Returns
For example, using the function_t table described for the corr function:
select covar_pop(col1,col2) from function_t;
This example returns:
0.026666666666667
covar_samp
This aggregate function returns the sample-based covariance of the set of number pairs inputa and inputb.
Description
The covar_samp function has the following syntax:
double = covar_samp(Set(double) inputa, Set(double) inputb);
The inputa value specifies the first number of each pair. The inputb value specifies the second number of each pair.
Returns
For example, using the function_t table described for the corr function:
select covar_samp(col1,col2) from function_t;
This example returns:
0.040000000000001
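The population and sample covariances differ only in the divisor: n for the population form and n - 1 for the sample form. The following Python sketch works through both using the values 1.2, 1.4, 1.6 and 1.4, 1.6, 1.8 from the corr example. It is an illustration only; the helper name covariance_sketch is hypothetical, and the last digits of the toolkit's output can differ because of floating-point rounding.

```python
def covariance_sketch(xs, ys, sample=False):
    """Covariance of paired values; divide by n - 1 when sample is True."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations: (-0.2 * -0.2) + (0 * 0) + (0.2 * 0.2) = 0.08
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return co_dev / (n - 1 if sample else n)


col1 = [1.2, 1.4, 1.6]
col2 = [1.4, 1.6, 1.8]
population = covariance_sketch(col1, col2)            # 0.08 / 3
sampled = covariance_sketch(col1, col2, sample=True)  # 0.08 / 2
print(round(population, 9), round(sampled, 9))
```

With three pairs the sum of products is 0.08, so the population form gives about 0.0267 and the sample form gives about 0.04, in line with the two documented results.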
