
Cleaning Text

Ron de Bruin and Norman Harker, March 2005


Note: These formulas were developed for use with the DataRefiner Excel Addin.

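Remove Excess Spaces and Non-Printing Characters (especially CHAR(10)), Substitute " " for all CHAR(160), and Remove ".", "?" and "!" from End
Cell  Source                  Result                  Formula
A7    Norman John             Norman John Harker      =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A7),LEN(TRIM(A7))-OR(RIGHT(TRIM(A7))={"?","!","."})),CHAR(160)," ")))
      Harker
A8    Norman                  Norman Harker           =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A8),LEN(TRIM(A8))-OR(RIGHT(TRIM(A8))={"?","!","."})),CHAR(160)," ")))
      Harker!
A9    Norman                  Norman Harker           =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A9),LEN(TRIM(A9))-OR(RIGHT(TRIM(A9))={"?","!","."})),CHAR(160)," ")))
      Harker?
A10   Norman                  Norman Harker           =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A10),LEN(TRIM(A10))-OR(RIGHT(TRIM(A10))={"?","!","."})),CHAR(160)," ")))
      Harker.
A11   Norman Harker?          Norman Harker           =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A11),LEN(TRIM(A11))-OR(RIGHT(TRIM(A11))={"?","!","."})),CHAR(160)," ")))
A12   Norman Harker           Norman Harker           =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A12),LEN(TRIM(A12))-OR(RIGHT(TRIM(A12))={"?","!","."})),CHAR(160)," ")))
A13   Norman John Ron Harker  Norman John Ron Harker  =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A13),LEN(TRIM(A13))-OR(RIGHT(TRIM(A13))={"?","!","."})),CHAR(160)," ")))
A14   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A14),LEN(TRIM(A14))-OR(RIGHT(TRIM(A14))={"?","!","."})),CHAR(160)," ")))
A15   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A15),LEN(TRIM(A15))-OR(RIGHT(TRIM(A15))={"?","!","."})),CHAR(160)," ")))
A16   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A16),LEN(TRIM(A16))-OR(RIGHT(TRIM(A16))={"?","!","."})),CHAR(160)," ")))
A17   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A17),LEN(TRIM(A17))-OR(RIGHT(TRIM(A17))={"?","!","."})),CHAR(160)," ")))
A18   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(LEFT(TRIM(A18),LEN(TRIM(A18))-OR(RIGHT(TRIM(A18))={"?","!","."})),CHAR(160)," ")))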
Formula Comments
Wow! 7 Text functions and a coerced logical formula using an array structure in a calculation (of length)!
Look Mum! No VBA!
The above formula is a composite of the two formulas below.
Note that if used on a large data range, the formula will increase the size of the workbook and slow down recalculation.
In most cases this formula will be used only once; afterwards, use Copy > Paste Special > Values > OK to replace the formulas with their results.
The source data can then be deleted, or you can copy the results over the source data.
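
For readers who want to check the order of evaluation outside Excel, here is a minimal Python sketch of the composite formula. It is not part of the original workbook: the names trim, clean_text and TERMINATORS are hypothetical, and "\u00a0" is assumed to stand in for CHAR(160) as on a typical Windows code page.

```python
TERMINATORS = ("?", "!", ".")


def trim(text: str) -> str:
    """Approximate Excel TRIM: only ordinary spaces (code 32) are affected."""
    return " ".join(part for part in text.split(" ") if part)


def clean_text(text: str) -> str:
    """Hypothetical mirror of the composite formula, evaluated inside-out."""
    trimmed = trim(text)
    # LEFT(TRIM(A7), LEN(TRIM(A7)) - OR(...)): True counts as 1, False as 0
    cut = trimmed[: len(trimmed) - (trimmed[-1:] in TERMINATORS)]
    spaced = cut.replace("\u00a0", " ")  # SUBSTITUTE(..., CHAR(160), " ")
    printable = "".join(ch for ch in spaced if ord(ch) >= 32)  # CLEAN
    return trim(printable)  # outermost TRIM


print(clean_text("  Norman \nHarker!  "))  # Norman Harker
```

In this sketch the punctuation test runs before CHAR(160) is replaced, exactly as the nesting of the formula dictates. A non-breaking space sitting after the final "!" therefore leaves the "!" in place, which suggests the worksheet formula would behave the same way.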

Just Remove Excess Spaces and Non-Printing Characters (especially CHAR(10)) and Substitute " " for CHAR(160)
Cell  Source                  Result                  Formula
A30   Norman John             Norman John Harker      =TRIM(CLEAN(SUBSTITUTE(A30,CHAR(160)," ")))
      Harker
A31   Norman                  Norman Harker!          =TRIM(CLEAN(SUBSTITUTE(A31,CHAR(160)," ")))
      Harker!
A32   Norman                  Norman Harker?          =TRIM(CLEAN(SUBSTITUTE(A32,CHAR(160)," ")))
      Harker?
A33   Norman                  Norman Harker.          =TRIM(CLEAN(SUBSTITUTE(A33,CHAR(160)," ")))
      Harker.
A34   Norman Harker?          Norman Harker?          =TRIM(CLEAN(SUBSTITUTE(A34,CHAR(160)," ")))
A35   Norman Harker           Norman Harker           =TRIM(CLEAN(SUBSTITUTE(A35,CHAR(160)," ")))
A36   Norman John Ron Harker  Norman John Ron Harker  =TRIM(CLEAN(SUBSTITUTE(A36,CHAR(160)," ")))
A37   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(A37,CHAR(160)," ")))
A38   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(A38,CHAR(160)," ")))
A39   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(A39,CHAR(160)," ")))
A40   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(A40,CHAR(160)," ")))
A41   (empty)                 (empty)                 =TRIM(CLEAN(SUBSTITUTE(A41,CHAR(160)," ")))

Formula Comments
The formula replaces the non-breaking space character CHAR(160) with an ordinary space. CHAR(160) often causes trouble with data imported from HTML sources.
CLEAN removes non-printing characters (character codes 0 to 31), with the most common "culprit" in Excel being CHAR(10).
CHAR(10) is inserted when you use Alt-Enter to force a line wrap. It is not "seen" in the cell it is entered in but will appear as a box in cells that reference it.
The logic of the formula is that we SUBSTITUTE CHAR(160) first, then CLEAN, then TRIM.
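
The three steps can be sketched in Python as a rough cross-check. This is an illustration under stated assumptions, not the workbook's own code: excel_trim, excel_clean and remove_excess_spaces are hypothetical names that only approximate the worksheet functions, and NBSP is assumed to correspond to CHAR(160).

```python
NBSP = "\u00a0"  # assumed equivalent of CHAR(160)


def excel_trim(text: str) -> str:
    """Approximate TRIM: drop leading/trailing spaces, collapse inner runs."""
    return " ".join(part for part in text.split(" ") if part)


def excel_clean(text: str) -> str:
    """Approximate CLEAN: drop characters with codes 0 to 31."""
    return "".join(ch for ch in text if ord(ch) >= 32)


def remove_excess_spaces(text: str) -> str:
    """Mirror of =TRIM(CLEAN(SUBSTITUTE(A30,CHAR(160)," ")))."""
    return excel_trim(excel_clean(text.replace(NBSP, " ")))


print(remove_excess_spaces("  Norman" + NBSP + "John \nHarker  "))  # Norman John Harker
```

Because the CLEAN step deletes the line break rather than replacing it, the sketch only leaves a space between two words when one already sits next to the break; "Norman\nHarker" with no adjacent space comes out as "NormanHarker".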

Just Remove Terminating Punctuation and TRIM
Cell  Source                  Result                  Formula
A50   Norman John             Norman John             =LEFT(TRIM(A50),LEN(TRIM(A50))-OR(RIGHT(TRIM(A50))={"?","!","."}))
      Harker                  Harker
A51   Norman                  Norman                  =LEFT(TRIM(A51),LEN(TRIM(A51))-OR(RIGHT(TRIM(A51))={"?","!","."}))
      Harker!                 Harker
A52   Norman                  Norman                  =LEFT(TRIM(A52),LEN(TRIM(A52))-OR(RIGHT(TRIM(A52))={"?","!","."}))
      Harker?                 Harker
A53   Norman                  Norman                  =LEFT(TRIM(A53),LEN(TRIM(A53))-OR(RIGHT(TRIM(A53))={"?","!","."}))
      Harker.                 Harker
A54   Norman Harker?          Norman Harker           =LEFT(TRIM(A54),LEN(TRIM(A54))-OR(RIGHT(TRIM(A54))={"?","!","."}))
A55   Norman Harker           Norman Harker           =LEFT(TRIM(A55),LEN(TRIM(A55))-OR(RIGHT(TRIM(A55))={"?","!","."}))
A56   Norman John Ron Harker  Norman John Ron Harker  =LEFT(TRIM(A56),LEN(TRIM(A56))-OR(RIGHT(TRIM(A56))={"?","!","."}))
A57   (empty)                 (empty)                 =LEFT(TRIM(A57),LEN(TRIM(A57))-OR(RIGHT(TRIM(A57))={"?","!","."}))
A58   (empty)                 (empty)                 =LEFT(TRIM(A58),LEN(TRIM(A58))-OR(RIGHT(TRIM(A58))={"?","!","."}))
A59   (empty)                 (empty)                 =LEFT(TRIM(A59),LEN(TRIM(A59))-OR(RIGHT(TRIM(A59))={"?","!","."}))
A60   (empty)                 (empty)                 =LEFT(TRIM(A60),LEN(TRIM(A60))-OR(RIGHT(TRIM(A60))={"?","!","."}))
A61   (empty)                 (empty)                 =LEFT(TRIM(A61),LEN(TRIM(A61))-OR(RIGHT(TRIM(A61))={"?","!","."}))

Formula Comments
First look at how we determine if there are terminating punctuation characters.
Cell  Source                  Result  Formula
A65   Norman John             FALSE   =-OR(RIGHT(TRIM(A65))={"?","!","."})
      Harker
A66   Norman                  TRUE    =-OR(RIGHT(TRIM(A66))={"?","!","."})
      Harker!
A67   Norman                  TRUE    =-OR(RIGHT(TRIM(A67))={"?","!","."})
      Harker?
A68   Norman                  TRUE    =-OR(RIGHT(TRIM(A68))={"?","!","."})
      Harker.
A69   Norman Harker?          TRUE    =-OR(RIGHT(TRIM(A69))={"?","!","."})
A70   Norman Harker           FALSE   =-OR(RIGHT(TRIM(A70))={"?","!","."})
A71   Norman John Ron Harker  FALSE   =-OR(RIGHT(TRIM(A71))={"?","!","."})
A72   (empty)                 FALSE   =-OR(RIGHT(TRIM(A72))={"?","!","."})
A73   (empty)                 FALSE   =-OR(RIGHT(TRIM(A73))={"?","!","."})
A74   (empty)                 FALSE   =-OR(RIGHT(TRIM(A74))={"?","!","."})
A75   (empty)                 FALSE   =-OR(RIGHT(TRIM(A75))={"?","!","."})
A76   (empty)                 FALSE   =-OR(RIGHT(TRIM(A76))={"?","!","."})

Note the "-" before the OR. This forces a return of -1 for TRUE and 0 for FALSE.
We've used an internal array within the OR function to check for existence of "?", "!" or "." in the TRIMmed target cell.
We have to TRIM the target cell in this OR function and elsewhere in the main formula just in case some darned fool has put a space after the punctuation.
The use of the internal array structure allows us to cycle through the options efficiently and without a long OR function that tests different values of the same parameter.
The OR function will normally return TRUE or FALSE but we negate the function and force it to return -1 or 0.
Forcing to -1 or 0 allows us to use this element to calculate the LENgth of the TRIMmed text for use by the LEFT function. In the main formula the "-" sits between LEN(...) and OR(...), so 1 is subtracted from the length when the OR is TRUE and nothing when it is FALSE.
Forcing logical expressions to return numbers in this way is commonly used where addition, subtraction or multiplication by 1 or zero can serve a given objective.
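
The same trick of treating TRUE as 1 can be sketched in Python, where a boolean also counts as 1 or 0 in arithmetic. This is a minimal illustration under stated assumptions rather than a translation of the workbook: excel_trim, ends_with_terminator and remove_terminating_punctuation are hypothetical names.

```python
TERMINATORS = ("?", "!", ".")


def excel_trim(text: str) -> str:
    """Approximate TRIM: drop leading/trailing spaces, collapse inner runs."""
    return " ".join(part for part in text.split(" ") if part)


def ends_with_terminator(text: str) -> bool:
    """Mirror of OR(RIGHT(TRIM(A65))={"?","!","."})."""
    return excel_trim(text)[-1:] in TERMINATORS


def remove_terminating_punctuation(text: str) -> str:
    """Mirror of =LEFT(TRIM(A50),LEN(TRIM(A50))-OR(...))."""
    trimmed = excel_trim(text)
    keep = len(trimmed) - int(ends_with_terminator(text))  # True counts as 1
    return trimmed[:keep]


print(-int(ends_with_terminator("Norman Harker? ")))      # -1
print(remove_terminating_punctuation("Norman Harker? "))  # Norman Harker
```

Membership in the TERMINATORS tuple plays the role of the internal array: one test covers all three characters, with no chain of separate comparisons.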

Acknowledgements
Dave McRitchie's website at:
http://www.mvps.org/dmcritchie/excel/strings.htm
A general resource and information repository on string manipulation.
We highly recommend a visit to Dave's web site Index at:
http://www.mvps.org/dmcritchie/excel/xlindex.htm

Ron de Bruin
Norman Harker
March 2005

Further References
See especially the DataRefiner Addin downloadable free from:
http://www.rondebruin.nl/win/addins/sectionaddins.htm
Also visit Ron's Home Page for links to other addins:
http://www.rondebruin.nl/index.html
And if you like these tips, you might be interested in a lot more at:
http://www.rondebruin.nl/tips.htm

