Tuesday, February 14, 2012

Counting all the characters in a column

I'm a SQL newbie...'nuf said.

How do I use a SELECT statement to count all the characters in a given column? I would like to calcuate the total number of characters, the average length, etc.

I can use the LEN function to get a single row like this:

SELECTLEN(NAME1)

FROM INDEXINFO

WHERE NDX = 101

I played around with the COUNT function but that seems to be for counting rows. In VB I would use a FOR EACH clause and keep a running total for the result of the LEN function. Is this a problem better handled using a T-SQL script or can it be done with SELECT statement?

Thanks!

DeBug

Umm...I changed my key words in my Google search and figured out the following:

SELECTSUM(LEN(NAME1))FROM INDEXINFO

WHERE NDX < 100

Ok, this got me the total and I can of course change the WHERE clause but can I also get the AVE at the same time?

|||SELECT SUM(...), AVG(...) FROM INDEXINFO
WHERE NDX < 100|||

There are problems with using LEN(). See this thread for a recent discussion.

You could do the following:

Code Snippet


SELECT
ColumnLength = max( datalength( Name1 )),
ColumnAvg = avg( datalength( Name1 ))
FROM IndexInfo
WHERE NDX = 101

|||Thank you for your reply!|||

Hi Arnie,

You have to take in mind if the column is a unicode one, because for those, every character is 2 bytes.

AMB

|||

Arnie,

Thank you and Phil for helping me with this. Here is the final version:

SELECT

Name1 =sum(datalength( Name1 )),

Name2 =sum(datalength( name2 )),

Name3 =sum(datalength( name3 )),

Name4 =sum(datalength( name4 )),

Date =sum(datalength( date ))

FROM IndexInfo

SELECT

Name1 =avg(datalength( Name1 )),

Name2 =avg(datalength( name2 )),

Name3 =avg(datalength( name3 )),

Name4 =avg(datalength( name4 )),

Date =avg(datalength( date ))

FROM IndexInfo

btw, how do you do that code snippet?|||

I must have trailing spaces somewhere....I used both the len and datalength for the results to text:

Name1 Name2 Name3 Name4 Date

-- -- -- -- --

644653 1047334 59933 69820 888709

(1 row(s) affected)

Name1 Name2 Name3 Name4 Date

-- -- -- -- --

644597 1047332 59933 69820 888709

|||Look for the rows where the LEN() and DATALENGTH() don't match.|||

Great Reminder Alejandro,

DATALENGTH() reports the number of bytes used to store the field value, so when using DATALENGTH() with unicode (nchar(), nvarchar()) datatypes, it is necessary to divide by two if you are handling single byte characters.

|||

I wrapped the blanks in a delimiter to make it show a little better. I now need to review the data entry code to see why it let someone key all those blank spaces Sad

Thanks again to all that helped!

DeBug

SELECT NDX, MY_BLANKS = ('-->' + NAME1 + '<--') FROM INDEXINFO

WHERE DATALENGTH(NAME1) > LEN(NAME1);

NDX MY_BLANKS
-- --
263 --> <--
427 --> <--

(2 row(s) affected)

No comments:

Post a Comment