DEV Community

Cover image for Substring from a Column of Strings — From SQL to SPL #25
1 1 1 1 1

Substring from a Column of Strings — From SQL to SPL #25

Problem description & analysis:

The database table tbl has a string field DESCRIPTION.

source table

Task: Now we need to retrieve the word ‘EN’ and the subsequent string of numbers from the DESCRIPTION field. The string of numbers may consist entirely of digits, such as ‘10204’, or it may contain special characters, such as ‘10277/10’. Caution: Do not retrieve punctuation marks; If the string does not contain ‘EN’, return null.

expected results

Code comparisons:

SQL

DECLARE @separator CHAR(1) = SPACE(1);
SELECT *
    , REPLACE(c.query('
          for $x in /root/r[text()="EN"]
          let $pos := count(root/r[. << $x]) + 1
          return if (xs:int(substring((/root/r[$pos + 1]/text())[1],1,5)) instance of xs:int) then
                data(/root/r[position()=($pos, $pos + 1)])
            else data(/root/r[$pos])
        ').value('text()[1]', 'NVARCHAR(30)')
        ,',', '') AS NORM
FROM @tbl AS t
CROSS APPLY (SELECT PATINDEX('%EN[0-9][0-9][0-9][0-9][0-9]%', DESCRIPTION)) AS t2(pos)  -- to handle Exception cases
CROSS APPLY (SELECT TRY_CAST('<root><r><![CDATA[' + 
    REPLACE(IIF(pos > 0, STUFF(DESCRIPTION,pos + 2,0, SPACE(1)), DESCRIPTION), @separator, ']]></r><r><![CDATA[') + 
    --REPLACE(DESCRIPTION, @separator, ']]></r><r><![CDATA[') + 
    ']]></r></root>' AS XML)) AS t1(c)
ORDER BY ID;
Enter fullscreen mode Exit fullscreen mode

SQL can implement it using CROSS APPLY with XML syntax, but the code is complex and lengthy. Using regular expressions can also implement it, but the code is more difficult to understand.

SPL: SPL provides string-splitting functions and ordered calculation functions, with simple and easy-to-understand code:

esProc SPL code

A1: Load data.

A2: Find the substring after “EN “ from the large string, split it into a set by character, extract the preceding numeric members, as well as special characters such as +-%_ in the numeric string, and merge them into a numeric string. The select function is used for filtering, @c represents taking from the first member onward until the first member that makes the condition false. The pos function returns the position of the substring.


Free to Try, Powerful to Use — esProc SPL FREE Download.

Heroku

Built for developers, by developers.

Whether you're building a simple prototype or a business-critical product, Heroku's fully-managed platform gives you the simplest path to delivering apps quickly — using the tools and languages you already love!

Learn More

Top comments (1)

Collapse
 
judith677 profile image
Judith-Data-Processing-Hacks

Want to share your feedback or dive into more data processing solutions?
Feel free to join us!

💭Discord
💭Reddit

ACI image

ACI.dev: Fully Open-source AI Agent Tool-Use Infra (Composio Alternative)

100% open-source tool-use platform (backend, dev portal, integration library, SDK/MCP) that connects your AI agents to 600+ tools with multi-tenant auth, granular permissions, and access through direct function calling or a unified MCP server.

Check out our GitHub!