Split Delimited String into Columns in SQL Server with PARSENAME

By: Aubrey Love   |   Updated: 2020-02-19



Problem

Typically, in SQL Server tables and views, values such as a person's name or address are stored either in a single concatenated string or as individual columns for each part of the whole value. For example: John Smith 123 Happy St Labor Town, CA. This data could be stored in a single column with all the data, in two columns separating the person's name from their address, or in multiple columns with a column for each piece of the total value.

Because of this, DBAs are inevitably having to concatenate or parse values to accommodate our customers' needs.

To build on the sample above, I have an address column that holds the full address (street number and name, city and state) separated by commas in a single concatenated column. I want to break that data into three columns (street, city and state), respectively.

Solution

For this tip, let's create a test database and test table for the parsing example. In this scenario, we have a street name and number, followed by the city, then the state.

USE master;
GO

CREATE DATABASE myTestDB;
GO

USE myTestDB;
GO

The following code creates a table in the new (test) database with only two columns: an id column and an address column.

CREATE TABLE dbo.custAddress(
      colID INT IDENTITY PRIMARY KEY
    , myAddress VARCHAR(200)
    );
GO

Populate the table with some generic data. Notice that the address is one string value in one column.

INSERT INTO dbo.custAddress(myAddress)
VALUES('7890 - 20th Ave E Apt 2A, Seattle, VA')
     , ('9012 W Capital Way, Tacoma, CA')
     , ('5678 Old Redmond Rd, Fletcher, OK')
     , ('3456 Coventry House Miner Rd, Richmond, TX');
GO

Confirm the table creation and data entry with the following SQL SELECT statement.

SELECT * FROM dbo.custAddress;
GO

Your results should return, as expected, two columns: the id column and the address column. Notice that the address column is returned as a single string value in image 1.

Image 1: query results

Now let's break the data down into individual columns. The next step is to parse out the three individual parts of the address, which are separated by commas.

SELECT
      REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 1)) AS [Street]
    , REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 2)) AS [City]
    , REVERSE(PARSENAME(REPLACE(REVERSE(myAddress), ',', '.'), 3)) AS [State]
FROM dbo.custAddress;
GO

Since we didn't include the id column in the SELECT statement above, the results return only the address string divided into three columns, each holding its portion of the address, as shown in image 2.

Image 2: query results
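
To see why the REVERSE calls are needed, it can help to look at the intermediate values for a single address. PARSENAME numbers its parts from the right, so the string is reversed first to make position 1 the left-most piece, and the extracted piece is then reversed back. The following is only an illustrative sketch: the variable name @addr and the Step column aliases are made up for this example, and the LTRIM on the city is an addition that removes the space following each comma in the sample data.

```sql
-- Illustrative walkthrough on one literal value (not part of the original tip).
DECLARE @addr VARCHAR(200) = '9012 W Capital Way, Tacoma, CA';

SELECT
      REVERSE(@addr)                                                  AS Step1_Reversed   -- whole string reversed
    , REPLACE(REVERSE(@addr), ',', '.')                               AS Step2_Dots       -- commas swapped for periods
    , PARSENAME(REPLACE(REVERSE(@addr), ',', '.'), 1)                 AS Step3_Part1      -- right-most part, still reversed
    , REVERSE(PARSENAME(REPLACE(REVERSE(@addr), ',', '.'), 1))        AS Street           -- 9012 W Capital Way
    , LTRIM(REVERSE(PARSENAME(REPLACE(REVERSE(@addr), ',', '.'), 2))) AS City;            -- Tacoma, leading space trimmed
GO
```

Without the LTRIM, the city and state pieces keep the space that followed the comma in the source string.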

Here in the real world, DBAs are often faced with more complex tables or views, not just a simple two-column table as in the sample above. Although that sample is a good primer for dissecting how to parse a string value, the following section demonstrates a more complex situation.

Again, create a sample table in our test database to work with. Here, we create a slightly more complex table with some additional columns.

CREATE TABLE employeeData(
      colID INT IDENTITY PRIMARY KEY
    , empID INT
    , empName VARCHAR(50)
    , empAddress VARCHAR(200)
    , empPhone VARCHAR(12)
    , jobClass VARCHAR(50)
    );
GO

Insert some generic data into the test table.

INSERT INTO employeeData(empID, empName, empAddress, empPhone, jobClass)
VALUES (1, 'John, M, Smith, Jr', '123 Happy Hollow, Barnyard, OK, 90294', '202-555-0118', 'Developer')
     , (2, 'Joe, S, Jones, Sr', '456 Sad Ln, BarnDoor, TX, 90295', '202-555-0195', 'Tester')
     , (3, 'Sammy, L, Smuthers', '5655 Medow Lane, Pastuer, CA, 90296', '202-555-0192', 'Sales')
     , (4, 'Henry, R, Lakes, Esq', '8749 Sunshine Park, Glenndale, HA, 90297', '202-555-0141', 'Director')
     , (5, 'Harry, Q, Public, Jr', '555 Somber Ln, Levy, OR, 90298', '202-555-0137', 'Graphical Designer');
GO

Now, let's create a mixed SELECT statement that pulls all the columns and breaks the "empName", "empAddress" and "empPhone" columns down into multiple columns.

In the following block of code, notice that I restarted my "place / position" count on each column that I want to parse. Each time you start parsing a new column, you must start your count over. You should also note that the "empName" and "empAddress" string values are separated by commas while the "empPhone" string value is separated by hyphens, so the delimiter between the first set of single quotes in the REPLACE call must change to match.

SELECT
      colID
    , empID
    -- The following section breaks down the "empName" column into three columns.
    , REVERSE(PARSENAME(REPLACE(REVERSE(empName), ',', '.'), 1)) AS FirstName
    , REVERSE(PARSENAME(REPLACE(REVERSE(empName), ',', '.'), 2)) AS MiddleName
    , REVERSE(PARSENAME(REPLACE(REVERSE(empName), ',', '.'), 3)) AS LastName
    -- The following section breaks down the "empAddress" column into four columns.
    , REVERSE(PARSENAME(REPLACE(REVERSE(empAddress), ',', '.'), 1)) AS Street
    , REVERSE(PARSENAME(REPLACE(REVERSE(empAddress), ',', '.'), 2)) AS City
    , REVERSE(PARSENAME(REPLACE(REVERSE(empAddress), ',', '.'), 3)) AS State
    , REVERSE(PARSENAME(REPLACE(REVERSE(empAddress), ',', '.'), 4)) AS ZipCode
    -- The following section breaks down the "empPhone" column into three columns.
    , REVERSE(PARSENAME(REPLACE(REVERSE(empPhone), '-', '.'), 1)) AS AreaCode
    , REVERSE(PARSENAME(REPLACE(REVERSE(empPhone), '-', '.'), 2)) AS Prefix
    , REVERSE(PARSENAME(REPLACE(REVERSE(empPhone), '-', '.'), 3)) AS LastFour
    , jobClass
FROM employeeData;
GO

The results should return 13 columns from the six columns queried against, as shown in image 3.

Image 3: query results
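
One limit is worth keeping in mind here. The "empAddress" column is split into four pieces, and four is the most PARSENAME supports, because the function was designed for object names of the form server.database.schema.object. The quick check below is only a sketch with made-up literal values; it shows that a string with a fifth part returns NULL instead of a value. For the same reason, a period that already exists in the data is treated as one more separator.

```sql
-- Illustrative check of the four-part limit (literal values are hypothetical).
SELECT
      PARSENAME('a.b.c.d', 1)   AS FourParts   -- d, the right-most part
    , PARSENAME('a.b.c.d.e', 1) AS FiveParts;  -- NULL, more than four parts
GO
```

If your delimited values can have more than four pieces, or can contain periods, this technique is probably not the right fit for that column.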

In summary, the PARSENAME function is a handy addition to your T-SQL toolkit for writing queries involving delimited data. It allows for parsing out and returning individual segments of a string value into separate columns. Since the PARSENAME function breaks down the string, you are not obligated to return all the delimited values. As in our sample above, you could have returned only the area code from the "empPhone" column to filter for certain area codes in your search.
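
As a rough sketch of that last point, the query below returns only the area code from "empPhone" and filters on it. It assumes the employeeData table and sample rows created earlier; the value '202' is simply the area code that appears in that sample data.

```sql
-- Sketch: return just the area code and keep rows that match one value.
SELECT
      empID
    , empName
    , REVERSE(PARSENAME(REPLACE(REVERSE(empPhone), '-', '.'), 1)) AS AreaCode
FROM employeeData
WHERE REVERSE(PARSENAME(REPLACE(REVERSE(empPhone), '-', '.'), 1)) = '202';
GO
```

Because PARSENAME numbers its parts from the right, PARSENAME(REPLACE(empPhone, '-', '.'), 3) returns the same area code for these three-part phone numbers without the two REVERSE calls.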

Next Steps
  • CONCAT()
  • String Concatenation +
  • CAST and CONVERT

About the author

Aubrey Love has been a Database Administrator for about 8 years and is currently working as a Microsoft SQL Server Business Intelligence specialist.

Article Last Updated: 2020-02-19

Source: https://www.mssqltips.com/sqlservertip/6321/split-delimited-string-into-columns-in-sql-server-with-parsename/
