Using Regular Expressions (Part 2)

Posted on Dec 12, 2006

Posted in category:
Development
C#

This is the second part of my Regular Expression overview.  In the first article I discussed the basics of creating Regular Expressions and provided a link to test expression patterns using the .NET Framework RegEx classes.  In this article I will discuss the various actions that can be taken to match values in a string using regular expressions.  I will also discuss how you can implement a SQL CLR UDF to allow regular expression validation from your database to provide a strong level of input validation at the database level.

Below I will start by discussing the use of the RegEx classes in standard .NET applications.  My examples are all based on a C# console application designed to display the results of expression testing.  You can expand these examples to apply to other program types.  Please note that for all examples you must add a "using System.Text.RegularExpressions;" directive to your code so that you can reference the RegEx classes without fully qualifying them.  At the bottom of this article, you will also find a download link to obtain sample projects for each of my examples.

Regular Expressions - Standard .NET Application Examples

Using bool Regex.IsMatch(string input, string pattern)

The first Regular Expression match option I will cover is the static method Regex.IsMatch(string input, string pattern).  This method gives you a quick way to get a boolean result indicating whether the regular expression matches anywhere in the input string provided by the user.  The method signature in its simplest form is provided above.  This overload of "IsMatch" uses the default regular expression options.

Below you will see the code required to receive input from the user, and to test the input value for a match based on the regular expression.

Example of IsMatch
//Prompt the user, 2 separate input items (inputRegEx and inputMatchText)
Console.WriteLine("Welcome to the Regular Expression demonstrator!");
Console.Write("Please enter a regular expression string:");
string inputRegEx = Console.ReadLine();
Console.Write("Please enter a test string for matching:");
string inputMatchText = Console.ReadLine();

//Perform the match test, then output the result
bool isDirectMatch = Regex.IsMatch(inputMatchText, inputRegEx);
Console.WriteLine("Result of Regex.IsMatch(string input, string expression): " + isDirectMatch.ToString());

Using this method I tested the expression "b\w*" (without the quotes).  It matches any string that contains a lowercase letter b followed by zero or more word characters.

Value       Result
billy       true
billy777    true
b           true
Billy       false
aaaBill     false

The results reported above are to be expected.  By default Regular Expression matches are case sensitive, so the expression above ONLY matches an input string that contains a lowercase b.  At times you want to validate that a string contains a particular value but you don't care whether a letter is upper or lower case.  You could modify your expression to be "[bB]\w*" to allow an upper or lower case letter b, but this adds an extra level of confusion to your expression.  The .NET Framework provides a "RegexOptions" enumeration that you can use to supply additional options when matching expressions.  We will discuss it in the next portion.
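To make the comparison concrete, here is a minimal sketch that runs the same five test values against both "b\w*" and the "[bB]\w*" alternative.  The class name CaseSensitivityDemo and the Check helper are my own names for illustration and are not part of the sample projects.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaseSensitivityDemo
{
    // Hypothetical helper: true when the pattern matches anywhere in the input
    public static bool Check(string input, string pattern)
    {
        return Regex.IsMatch(input, pattern);
    }

    public static void Main()
    {
        string[] inputs = { "billy", "billy777", "b", "Billy", "aaaBill" };

        foreach (string input in inputs)
        {
            // Print the result of both patterns side by side for each value
            Console.WriteLine(input
                + " | b\\w*: " + Check(input, @"b\w*")
                + " | [bB]\\w*: " + Check(input, @"[bB]\w*"));
        }
    }
}
```

The character class makes "Billy" and "aaaBill" match, but every letter you want to treat this way needs its own class, which is why a matching option is usually the cleaner choice.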

Using bool Regex.IsMatch(string input, string pattern, RegexOptions options)

This overload allows us to use the "RegexOptions" enumeration to specify a single option or a bitwise combination of options.  In this article we will only discuss the "IgnoreCase" option; to research the other available options please see the following MSDN article: http://msdn2.microsoft.com/en-us/library/system.text.regularexpressions.regexoptions.aspx.  Below you will see the example code to perform a case insensitive match on an expression.

Example Regex.IsMatch(input, pattern, options)
//Perform the match test, using the ignore case option
bool isIgnoreCaseMatch = Regex.IsMatch(inputMatchText, inputRegEx, RegexOptions.IgnoreCase);
Console.WriteLine("Result of Regex.IsMatch(string input, string expression, RegexOptions options): "
                    + isIgnoreCaseMatch);

The change to this example is very simple, yet the effect on the valid matches is great.  Below you will see the results for the same values used in the first test of the article.  The same regular expression text was used; the only difference was that "RegexOptions.IgnoreCase" was provided.

Value       Result
billy       true
billy777    true
b           true
Billy       true
aaaBill     true
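Because RegexOptions is a flags enumeration, more than one option can be passed at once by joining values with the bitwise OR operator.  The sketch below pairs IgnoreCase with IgnorePatternWhitespace, which I picked only to illustrate a combination; the class name CombinedOptionsDemo and the MatchWithOptions helper are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CombinedOptionsDemo
{
    // Hypothetical helper: wraps the three-argument IsMatch overload
    public static bool MatchWithOptions(string input, string pattern, RegexOptions options)
    {
        return Regex.IsMatch(input, pattern, options);
    }

    public static void Main()
    {
        // IgnorePatternWhitespace drops unescaped white space from the pattern
        // and treats everything after # as a comment, so this is still b\w*
        string pattern = @"b \w*   # the letter b, then zero or more word characters";

        // Combine two options with bitwise OR
        RegexOptions both = RegexOptions.IgnoreCase | RegexOptions.IgnorePatternWhitespace;

        Console.WriteLine("Both options: " + MatchWithOptions("Billy", pattern, both));
        Console.WriteLine("Whitespace option only: "
            + MatchWithOptions("Billy", pattern, RegexOptions.IgnorePatternWhitespace));
    }
}
```

With both options set, "Billy" matches; with only IgnorePatternWhitespace the match is case sensitive again and "Billy" is rejected.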

This shows you the power of using the regular expression options; you can provide more flexibility in your validation system.  The "Regex.IsMatch" method is a great tool, but it will not always suit your needs.  There are times when you need to know how many matches a string contains, or where they occur.  We will discuss some other options available in the following section.

Using MatchCollection Regex.Matches(string input, string pattern)

The Regex.Matches method lets you check a string for multiple matches of an expression.  This can be helpful when validating license keys or other types of input that might have multiple occurrences of the same pattern.  The inputs for this method are exactly the same as for the IsMatch method; however, this method returns a MatchCollection object.  The MatchCollection contains one Match object for each successful match, and the Match object exposes properties that describe the match.  The most helpful properties are Index, Length, and Value.  Index is the position in the input string of the first character of the match; this allows you to extract the match if needed.  Length is the number of characters contained in the match, and Value is the actual matched string.

Below you will see the code required to search for multiple matches and then output the details of each match.

Example of MatchCollection Regex.Matches
//Perform Matches() test
MatchCollection oMatches = Regex.Matches(inputMatchText, inputRegEx);
Console.WriteLine("Matches found using Regex.Matches(string input, string pattern): " 
                    + oMatches.Count.ToString());

Console.WriteLine("Match Detail, if applicable");

//Loop through the collection, this will be skipped if no match
foreach (Match oMatch in oMatches)
{
    Console.WriteLine("Match Index: " + oMatch.Index.ToString());
    Console.WriteLine("Match Length: " + oMatch.Length.ToString());
    Console.WriteLine("Match Value: " + oMatch.Value);
    Console.WriteLine("");

}
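As a rough illustration of the license key scenario mentioned above, the sketch below counts the blocks in a made-up key and uses Index and Length to pull each block back out of the input.  The key format (blocks of five uppercase letters or digits), the class name LicenseKeyDemo, and the CountBlocks helper are all assumptions for the example, not a real licensing scheme.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LicenseKeyDemo
{
    // Assumed block format: exactly five uppercase letters or digits
    private const string BlockPattern = @"[A-Z0-9]{5}";

    // Hypothetical helper: number of blocks found in the key
    public static int CountBlocks(string key)
    {
        return Regex.Matches(key, BlockPattern).Count;
    }

    public static void Main()
    {
        string key = "AB12C-DE34F-GH56J";
        Console.WriteLine("Blocks found: " + CountBlocks(key));

        foreach (Match match in Regex.Matches(key, BlockPattern))
        {
            // Index and Length locate the match inside the original string,
            // so Substring returns the same text as match.Value
            string extracted = key.Substring(match.Index, match.Length);
            Console.WriteLine("Index " + match.Index + ": " + extracted);
        }
    }
}
```

Note that this only counts occurrences of the block pattern; it does not check the separators or the overall shape of the key, which would need an anchored expression.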

Standard Usage Summary

The above examples should help you get started with Regular Expression validation in .NET.  The .NET Framework provides many methods and classes to validate and work with Regular Expressions.  This article has only scratched the surface, but it serves as a good starting point.  Please see the section below to learn how to create a CLR UDF to validate Zip Code input!

Regular Expressions in SQL CLR User Defined Functions

A place where Regular Expression validation can become very handy is in SQL Server 2005 CLR User Defined Functions and Stored Procedures.  Prior to the ability to use CLR Functions and procedures in SQL Server it was very cumbersome to implement sophisticated string validation at the database level.  Now with the SQL CLR functionality you are able to quickly build procedures that can be used to validate input on SQL Server.  Below you will see an example of how to create a CLR User Defined Function to validate a Zip Code based on the expression we created in part one of this article series.

First, before building this example function, we must ensure that CLR Integration is enabled on the SQL Server instance that hosts your database ('clr enabled' is a server-level setting).  You may run the following script to enable CLR Integration.

Enable CLR Integration
sp_configure 'clr enabled', 1
GO
RECONFIGURE
GO

Once this has been completed you will want to create a new "SqlServer" project.  You can create this project by selecting "New Project" from the "File" menu in Visual Studio.  You will find the "SqlServer" project type under "Visual C#" -> "Database" -> "SqlServer".  (NOTE: you may also create the UDF in Visual Basic, by selecting the SqlServer project type from the Visual Basic project listing)  When you create the project it will request that you provide it a link to your SQL Server.  This is needed for the automatic deployment and configuration of your stored procedure.

Once your project has been created, right-click on the project and select "Add" -> "User-Defined Function".  You will then be asked to give it a name.  In our case we will call it "ValidateZip.cs" to keep the name short and simple.  Visual Studio will then provide you a shell to place the code for your validation method.  In our case we will want to be sure to set the return type to "bool" as it is simply a yes or no answer.  We will also want to ensure that an input string value was provided.  Since a zip code is an all or nothing validation we will use the "Regex.IsMatch" method with a specific validation string to provide the result to the calling user.  Below is the completed code to validate the zip code.  NOTE: all User-Defined Functions intended for use in SQL Server must be declared as public and static!

Example Function
[Microsoft.SqlServer.Server.SqlFunction]
public static bool ValidateZip(string input)
{
    //Ensure an input value was provided; Regex.IsMatch throws on a null string
    if (input == null)
        return false;

    //Declare our expression
    string expression = @"^\d{5}(-\d{4})?$";

    //Test the input and return the result
    return Regex.IsMatch(input, expression);
}
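Before deploying, it can be useful to exercise the same expression from an ordinary console application.  The sketch below is one way to do that; the class name ZipPatternCheck, the IsValidZip helper, and the sample values are my own choices for illustration, and the null guard is an assumption about how you would want a missing value treated.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ZipPatternCheck
{
    // Same expression as the ValidateZip function
    private const string ZipPattern = @"^\d{5}(-\d{4})?$";

    // Hypothetical helper mirroring the UDF logic outside of SQL Server
    public static bool IsValidZip(string input)
    {
        // Treat a missing value as invalid; Regex.IsMatch would throw on null
        return input != null && Regex.IsMatch(input, ZipPattern);
    }

    public static void Main()
    {
        string[] samples = { "50320", "50320-1234", "5032", "50320-12", "abcde" };

        foreach (string sample in samples)
        {
            Console.WriteLine(sample + " -> " + IsValidZip(sample));
        }
    }
}
```

Only the five digit and ZIP+4 forms should come back as valid; the short, partial, and non-numeric samples are rejected because the expression is anchored at both ends.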

Now that you have the function built you can simply right-click on your project and select "Deploy".  Visual Studio will then register your function with the SQL Server and you can freely use this validation function in your SQL queries.  Below is a sample SQL query to retrieve the validation result for a local Des Moines, Iowa zip code.  If validation succeeds a 1 is returned; if it fails a 0 is returned.

select dbo.ValidateZip('50320')

Summary

This article shows you the basics of Regular Expression validation in .NET as well as how to incorporate regular expression validation into new SQL Server CLR User Defined Functions.  This should serve as a good starting point for understanding the various methods to implement regular expression validation in your new and existing projects.  Below you will find a link to a zip document with my two sample projects and the sample code used in this article.  Please feel free to review this code and let me know if you have any questions.

Sample Source Code