Java Language Extensions for SQL Server Open Sourced
Written by Nikos Vaggalis   
Monday, 25 May 2020

Language extensions - that's SQL Server's way of calling Java programs from within T/SQL and it's very useful.

 

SQL Server had for a long time been integrated with the Common Language Runtime under which it could interface with the general programming languages of the .NET framework, allowing for writing stored procedures, triggers, user-defined types, user-defined functions, user-defined aggregates, and streaming table-valued functions using the likes of Visual Basic .NET or C#.

But that's deep integration. For the rest of the languages, not part of the .NET framework, SQL Server can still interact with them at a minimal level through the so-called Language Extensions SDK, for the time being just programs written in Java.

Java support was added with the Microsoft Extensibility SDK for Java  as part of SQL Server 2019 Release Candidate 1 on both Windows and Linux.

The default Java runtime used is Azul's Zulu Open JRE, but you can also use another Java JRE or SDK. There are, however, good reason to opt for the Zulu Java runtime. In order to work around Oracle's licensing traps regarding Java, Microsoft partnered with Azul to provide the Enterprise distribution of Java for free and, most importantly, with support provided by both Microsoft and Azul.

To make it work, you just compile your class as library/.jar, and write a T-SQL stored procedure that calls another one, sp_execute_external_script, which finally makes the call to the Java library as "package.class".

There's a detailed official tutorial on writing a Java class that checks a  set of strings passed in from SQL Server against a regular expression. In more detail, we write a stored procedure that takes an input dataset and a regular expression. This procedure executes sp_execute_external_script with those inputs, which in turn calls our library regex.jar to return the rows that fulfilled the given regular expression. It uses a regular expression [Jj]ava that checks if a text contains the word Java or java.

CREATE OR ALTER PROCEDURE [dbo].[java_regex] @expr nvarchar(200), @query nvarchar(400)
AS
BEGIN
--Call the Java program by giving the package.className in @script
--The method invoked in the Java code is always the "execute" method
EXEC sp_execute_external_script
@language = N'Java'
, @script = N'pkg.RegexSample'
, @input_data_1 = @query
, @params = N'@regexExpr nvarchar(200)'
, @regexExpr = @expr
with result sets ((ID int, text nvarchar(100)));
END
GO

--Now execute the above stored procedure and provide the regular expression and an input query
EXECUTE [dbo].[java_regex] N'[Jj]ava', N'SELECT id, text FROM testdata'
GO

This extensibility goes a long way - you can use it in all kinds of scenarios such as calling REST APIs from within SQL server directly. In other words, you won't have to write a Java program that interfaces with the database through the JDBC to fetch the data that is then used to call a RESTful service in order get some return values back. You can even work with JSON since SQL Server supports JSON handling inherently. So from within your T-SQL you just directly call the Java program that interfaces with the API.

Security wise, it's also important to note that the Java program would execute in the context of a SQL Server query as the same SQL Server sandboxing safeguards apply there too.

The latest news is that Microsoft went one step further and open sourced this Java Language Extension in an effort to motivate the open source community to contribute to the project.

So while not a tight integration with the innards of SQL Server through the CLR, Language Extensions prove immensely valuable. DBMSs have gone a long way integrating advanced functionality found in general programming languages or libraries, but can't do everything. We've seen such an example in "Ingres 11 Technical Preview" when covering the news about Actian's new version of the venerable DBMS.

So in that version there were added things like MD5SUM,SHA1SUM,AES symmetric encryption for BLOB.Also String functions like DMETAPHONE(c1) which returns the metaphone code from a varchar string using the double metaphone algorithm, JARO_WINKLER(c1 c2) which calculates the jaro_winkler distance between two VARCHAR strings,
LEVENSHTEIN(c1 c1) which calculates the Levenshtein distance between two VARCHAR strings (that is, the minimum number of changes that need to be made to convert the source string into the target string), Daitch-Mokotoff SOUNDEX_DM or SOUNDEX_NYSIIS which returns the NYSIIS Soundex code from a string.

Also additions on the regular extensions front, by supporting patterns with LIKE, SIMILAR TO, BEGINNING, ENDING.

For example, BEGINNING returns the substring from the string that starts with the first occurrence of the pattern, while ENDING returns the substring from the string that ends with the last occurrence of the pattern:

 

SELECT SUBSTRING('Company 2016' BEGINNING 'test') //returns NULL

SELECT SUBSTRING('Company 2016' BEGINNING 'pan\|20'ESCAPE '\')
//returns 'pany 2016'

SELECT SUBSTRING('Company 2016' BEGINNING '20\|pan'ESCAPE '\')
//returns '2016'

SELECT SUBSTRING('Company 2016' BEGINNING '20\|pan'ESCAPE '\')
//returns '2016'

 

LIKE and SIMILAR TO return the substring from the string that best matches the pattern:

 

SELECT SUBSTRING('Company 2016' LIKE 'Company \"%\"' ESCAPE '\')
//returns '2016'

SELECT SUBSTRING('Company 2016' SIMILAR TO '% \"[0-9]{4}\"'ESCAPE'\')
//returns '2016'

SELECT SUBSTRING('217 Bath Road, Slough, Berkshire, SL1 4AA, UK' SIMILAR TO '%\"[A-Z]{1,2}[0-9]{1,2} [0-9][A-Z]{2}\"%' ESCAPE '\')
//returns 'SL1 4AA'

 

The point made here is that even with great additions such as these, which are very welcome and enhance the product much, there's still a limit in functionality. Hence calling external libraries that can do it better is highly desirable.

Now there's a thought - what about the OSS community adding a language extension that calls Perl to leverage its advanced regular expression functionality?

 

More Information

Microsoft SQL Server Language Extensions 

Microsoft SQL Server Language Extensions - Java regex tutorial
Open sourcing the Java language extension for SQL Server announcement

Related Articles

Ingres 11 Technical Preview

The Enduring Influence Of Postgres

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.

Banner


Google Updates Responsible AI Toolkit
01/11/2024

Google has announced updates to the Responsible Generative AI Toolkit to enable it to be used with any LLM model. The Responsible GenAI Toolkit provides resources to design, build, and evaluate open A [ ... ]



Prompt Engineering Techniques To Make You An Expert
18/11/2024

Introducing a GitHub repository full of hot tips and instructions on how to build the perfect prompt presented in a collection of Jupiter Notebooks.


More News

espbook

 

Comments




or email your comment to: comments@i-programmer.info

Last Updated ( Wednesday, 27 May 2020 )