Ingres 11 Technical Preview
Written by Nikos Vaggalis   
Tuesday, 21 June 2016

Version 11 of Actian's Ingres open source RDBMS, is going to be a major release as indicated by the jump in the version number. The sequence goes 10.0.0, 10.1.0, 10.2 and from there to 11. This heralds quite a few changes and welcome additions.

The technical preview available at the moment provides only the bare minimum of information, so we dug deeper and can now reveal the most important new features, as far as developers are concerned, broken down by category. 

Drivers/Connectivity

  • Version 11 comes bundled with JDBC driver version 4.3 that supports Java8 and above.  

  • DAS or Data Access Server, the component that enables connectivity for .NET client application, replaces the JDBC server, as such unifying access of both JDBC and .NET clients under DAS.

  • Network level compression is enabled by default on JDBC and .NET client connections, and can be explicitly disabled by setting the connection attribute "compress=off."

  • The .NET data provider is integrated with VS2015, while a new utility, 'dotnetinfo', makes the file version of the .NET Data Provider (Ingres.Client.dll) easily available, a necessity when communicating with Actian's support desk in case of troubleshooting. 

Diagnostics/Debugging

  • SET SERVER_TRACE, SET SESSION_TRACE replaces trace point sc930. Trace points are low level instructions that force the client or server to emit more information on their internal workings, useful for performance tuning and troubleshooting.

sc930 or Query Tracing, gives the ability to trace all queries processed by the DBMS, capturing them in a textual format that permits their manual replaying, so that you get insight on which queries are run and how they're processed. sc930 is usable with older clients like those constructed with ABF (Applications by Forms, an Ingres propriety 4GL UI/green screen design tool) but wouldn't work with newer client technologies like JDBC, ODBC or .Net. As such the new SET SERVER_TRACE, SET SESSION_TRACE directives unify the tracing functionality for all kind of clients. 

SQL enhancements

Hashing functions generate a fixed length, that varies depending on the function used, "hash" value of the data passed in as argument. The following hashing functions find their way into version 11:

  • MD5SUM(c1)  returns the calculated MD5 message digest of its argument (c1) in CHAR(32) format type and length.

  • SHA1SUM(expr) returns the SHA-1 checksum of the input data, in CHAR(40) format type and length. SHA224SUM, SHA256SUM, SHA384SUM and SHA512SUM work in a similar way.
  • CREATE TABLE IF NOT EXISTS brings Ingres up to date with other RDBMSs.

  • AES symmetric encryption for BLOB (binary large object) columns based on an AES key size of 128, 192, 256 bits,(128 by default), and a user provided pass phrase;

  • COMMENT ON [VIEW] The COMMENT ON statement which creates a comment on a table, or column, displayable through issuing the HELP COMMENT statement from an isql terminal monitor, now adds support for View objects as well.

  • BOOLEAN to INTEGER coercions coerces 'true' to 1 and 'false' to 0.

  • IS NOT DISTINCT FROM operator

Normally a NULL cannot be compared to any other value, including another null value. This new operator changes the game by allowing comparison operations to treat NULLs as equals while comparing the rest of the values with =. So an expression such as:

WHERE columna IS NOT DISTINCT FROM columnb

evaluates to true if both columna and columnb are equal or both columna and columnb are NULL. The opposite, IS DISTINCT FROM, treats NULL as equals but compares the rest with <>.  

 

Windowing Functions

  • Addition of functions lag() and lead().

Lag or Lead return a scalar column value from a row, before (Lag) or after (Lead) the current row, as such useful for comparing values across multiple rows. For example Lag compares values in the current row with values in a previous row, while LEAD compares values in the current row with values in a following row.

  • Scalar subquery change in behaviour

In previous releases, the following two forms of syntax were equivalent:

 

SELECT col1_int, col2_float 
FROM session.vps18_gtt2
WHERE col1_int = (SELECT col1_int FROM session.vps18_gtt1
                  WHERE (col1_int>=11100 AND col1_int<=11115)) 
ORDER BY col1_int;

and

SELECT col1_int, col2_float 
FROM session.vps18_gtt2
WHERE col1_int in (SELECT col1_int FROM session.vps18_gtt1
                   WHERE (col1_int>=11100 AND col1_int<=11115)) 
ORDER BY col1_int;

Since the subquery is scalar, it should select only one column and return exactly one row, as such the second form which uses the IN operator should had triggered an exception in case the query returned multiple rows. However in the previous Ingres versions, the IN operator was reworked into having the same effect as 'equals ANY' which short circuits and returns on the first occurrence of a match. In other words IN worked as:

SELECT col1_int, col2_float 
FROM session.vps18_gtt2
WHERE col1_int = ANY (SELECT col1_int FROM session.vps18_gtt1
                   WHERE (col1_int>=11100 AND col1_int<=11115)) 
ORDER BY col1_int;

This now changes so that in that case, a cardinality violation is fired.

 

  • SET FLOAT_FORMAT 'format'

copydb and unloaddb are two utilities with which you can extract table data into text files, so they can subsequently be imported into another platform/OS or another Ingres version. The problem with those utilities is that when unloading and then re-importing floats there can be loss of precision when the float format of the operation is not set at the command line, hence the default n10.3 for float4 and float8 columns is used. The general syntax of the format argument for floating point is:

-fkxM.N

 -f is literal and signals a floating point format.

k is the data type selector and can be 4 or 8 for float4 or float8 respectively.

x is the format specifier, and can be one of E, F, G, or N.  M is the total field width in characters. 

N is the number of decimal places.
    

Hence to print float4 values in 9 character field with 2 decimal
places of precision, the flag would be -f4N9.2 . 

Therefore SET FLOAT_FORMAT, as in:

SET FLOAT_FORMAT f4F79.38

allows the session wide format to be followed for all subsequent operations, something that renders the action of specifying it on the command line unnecessary.

String functions

This category has seen the most additions in this release, especially with the enchantments of the SoundEx and Substring functions, as well as a new set  of text processing algorithm implementations. So to an already impressive set of algorithms implementations from within SQL including the generation or validation of check digits for the EAN (European Article Number), ISBN (International Standard Book Number), UPC (Universal Product Code), or LUHN and VERHOEFF (used in credit cards and many government identification numbers), the following are also added:

  • DMETAPHONE(c1)  Returns the metaphone code from a varchar string using the double metaphone algorithm.

  • JARO_WINKLER(c1 c2) Calculates the jaro_winkler distance between two VARCHAR strings. This is returned as a float value between 0 and 1, where 0.0 means no similarities and 1.0 means the strings are identical.

  • LEVENSHTEIN(c1 c1) Calculates the Levenshtein distance between two VARCHAR strings. The Levenshtein distance between two strings is the minimum number of changes that need to be made to convert the source string into the target string.

Closely related are the SoundEx improvements. SoundEx is a function for finding like-sounding strings. Ingres already had the standard Russel soundex, as well as the Daitch-Mokotoff SOUNDEX_DM extension, but now it also adds support for SOUNDEX_NYSIIS which returns the NYSIIS Soundex code from a string. This algorithm, devised in 1970 as part of the New York State Identification and Intelligence System (NYSIIS), is part of the New York State Criminal Justice Services, and is 2.7% more accurate than the traditional Russell Soundex, but not as accurate as the Daitch-Mokotoff Soundex.

Substring gives regular expression capabilities, albeit limited, from within SQL. These capabilities are now reinforced by adding support for patterns LIKE, SIMILAR TO, BEGINNING, ENDING

For example, BEGINNING returns the substring from the string that starts with the first occurrence of the pattern, while ENDING returns the substring from the string that ends with the last occurrence of the pattern:

  • SELECT SUBSTRING('Company 2016' BEGINNING 'test') returns NULL
  • SELECT SUBSTRING('Company 2016' BEGINNING 'pan\|20'ESCAPE '\')
    returns 'pany 2016'
  • SELECT SUBSTRING('Company 2016' BEGINNING '20\|pan'ESCAPE '\')
    returns '2016'

  • SELECT SUBSTRING('Company 2016' BEGINNING '20\|pan'ESCAPE '\')
    returns '2016'

LIKE and SIMILAR TO return the substring from the string that best matches the pattern:

  • SELECT SUBSTRING('Company 2016' LIKE 'Company \"%\"' ESCAPE '\')
    returns '2016'

  • SELECT SUBSTRING('Company 2016' SIMILAR TO '% \"[0-9]{4}\"'ESCAPE'\')
    returns '2016'

  • SELECT SUBSTRING('217 Bath Road, Slough, Berkshire, SL1 4AA, UK' SIMILAR TO '%\"[A-Z]{1,2}[0-9]{1,2} [0-9][A-Z]{2}\"%' ESCAPE '\')
    returns 'SL1 4AA'

 Finally, a release wouldn't be complete without bug fixes, thus this one incorporates patches for bugs found in previous release 10.2

There have also been amendments in other areas, such as in configuring network communication buffer sizes, enabling log rotation and improved security auditing, features that  cater for administrators as well as developers and add extra value to an  already robust, fast and open source RDBMS. 

More Information

Actian Ingres

Ingres New Features per Release

Actian Online Documentation Portal

Actian's TPC-H results

Related Articles

Ingres becomes Actian - some questions answered

UK Ingres Users Association Conference (2011)

OpenROAD Brings Data to the Web

 

To be informed about new articles on I Programmer, sign up for our weekly newsletter,subscribe to the RSS feed and follow us on, Twitter, FacebookGoogle+ or Linkedin

 

Banner


AI Propels Python To Top Language on GitHub
30/10/2024

This year's Octoverse Report reveals how AI is expanding on GitHub and that Python has now overtaken JavaScript as the most popular language on GitHub. The use of Jupyter Notebooks has also surged.



OpenAI Releases Swarm
25/10/2024

OpenAI has released an experimental educational framework for exploring ergonomic, lightweight multi-agent orchestration. Swarm is managed by the OpenAI Solution team, but is not intended to be used i [ ... ]


More News

 

espbook

 

Comments




or email your comment to: comments@i-programmer.info

 

Last Updated ( Tuesday, 21 June 2016 )