You can use the same logic to extract pattern matched strings from files, network shares etc.
1. define an IEnumerable static method with SqlFunction attribute.
2. define a GetRow method to return a single row for the tvf. Sql CLR will call this method to populate the tvf.
3. use a case-insensitive compare method to remove duplicates. You can either omit this method or use a DISTINCT clause in the select statement in Sql Server query.
[SqlFunction(FillRowMethodName = "GetRow", DataAccess = DataAccessKind.Read, TableDefinition = "found varchar(8000)")]I am implementing this functionality as a table value function (tvf) so that we can use this to join with other tables or tvfs.
public static IEnumerable GetRegExSearch([SqlFacet(MaxSize = -1)]SqlString text, SqlString regExSearchFor)
{
ListallFound = new List ();
MatchCollection matchCollection = Regex.Matches(text.ToString(), regExSearchFor.ToString());
for (int i = 0; i < matchCollection.Count; ++i)
{
String found = matchCollection[i].Value.ToString();
if (FoundInList(found, allFound) == false)
allFound.Add(found);
}
return allFound;
}
public static void GetRow(Object obj, out String found)
{
found = obj.ToString();
}
To register this assemby in Sql Server, use the deploy option from Visual Studio or copy the dll to sql server and execute
CREATE ASSEMBLY SqlCLR
FROM 'C:\SqlCLR.dll'
GO
CREATE FUNCTION [dbo].[fxGetRegExSearch](@text [nvarchar](max), @regExSearchFor [nvarchar](4000))
RETURNS TABLE (
[found] [nvarchar](max) NULL
) WITH EXECUTE AS CALLER
AS
EXTERNAL NAME [SqlCLR].[Functions].[fxGetRegExSearch]
download code
-paul