How To Extract Serialized Data From Strings In One Go In T-SQL

How To Extract Serialized Data From Strings In One Go In T-SQL

Consider the following data:

DECLARE @OrderString VARCHAR(MAX) = '1=1;|2=|3=|4=1;|5=|';

This data was being returned in an externally generated XML file, so out of the control of the developer.

Each element of information is delimited by a pipe |. Within each element, the first number, to the left the equal sign =, represented an Item Number. The second number, to the right of the equal sign, represented an Order Quantity. This could be any number, and if empty, it meant zero.

The developer needed to return this data in form of a table. He also needed to return only the rows with an Item Quantity greater than zero.

In T-SQL, this conversion can be done in one go, using a single query. Let’s go through the steps.

Remove Unnecessary Characters

First of all, if take a look at the source data, we can clearly see an oddity there. The semi-colon character seems to appear only after and when there is a declared Item Value. This is extra information we don’t need, as we already have each block delimited by the pipe character. So let’s take this out of the equation:

WITH Step0 AS
(
	SELECT
		String = REPLACE(@OrderString, ';', '')
),

Identify Elements

Now, to make our lives easier, let’s split each element, or, “block” into a separate row. This can be accomplished with a good old recursive CTE:

Step1 AS
(
	SELECT
		Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
		String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
	FROM
		Step0

	UNION ALL

	SELECT
		Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
		String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
	FROM
		Step1
	WHERE
		LEN(String) > 0
),

Identify Columns

Now that we have a row identified for each block, we can then extract the columns themselves:

Step2 AS
(
	SELECT
		ItemNumber = SUBSTRING(Block, 1, CHARINDEX('=', Block) - 1),
		ItemValue = SUBSTRING(Block, CHARINDEX('=', Block) + 1, LEN(Block))
	FROM
		Step1
),

Convert & Filter

Finally, now with proper columns in place, we can filter out the zero quantity items:

Step3 AS
(
	SELECT
		ItemNumber = CAST(ItemNumber AS INT),
		ItemValue = CAST(ItemValue AS INT)
	FROM
		Step2
	WHERE
		CAST(ItemValue AS INT) > 0
)

SELECT
	*
FROM
	Step3;

And that was it. Nothing to it.

Full Code

DECLARE @OrderString VARCHAR(MAX) = '1=1;|2=|3=|4=1;|5=|';

WITH Step0 AS
(
	SELECT
		String = REPLACE(@OrderString, ';', '')
),
Step1 AS
(
	SELECT
		Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
		String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
	FROM
		Step0

	UNION ALL

	SELECT
		Block = SUBSTRING(String, 1, CHARINDEX('|', String) - 1),
		String = SUBSTRING(String, CHARINDEX('|', String) + 1, LEN(String))
	FROM
		Step1
	WHERE
		LEN(String) > 0
),

Step2 AS
(
	SELECT
		ItemNumber = SUBSTRING(Block, 1, CHARINDEX('=', Block) - 1),
		ItemValue = SUBSTRING(Block, CHARINDEX('=', Block) + 1, LEN(Block))
	FROM
		Step1
),

Step3 AS
(
	SELECT
		ItemNumber = CAST(ItemNumber AS INT),
		ItemValue = CAST(ItemValue AS INT)
	FROM
		Step2
	WHERE
		CAST(ItemValue AS INT) > 0
)

SELECT
	*
FROM
	Step3;
Jorge Candeias's Picture

About Jorge Candeias

Jorge helps organizations build high-performing solutions on the Microsoft tech stack.

London, United Kingdom https://jorgecandeias.github.io