How to remove all whitespace from a string?

So " xx yy 11 22 33 " will become "xxyy112233". How can I achieve this?
7 Answers
In general, we want a solution that is vectorised, so here's a better test example:
whitespace <- " \t\n\r\v\f" # space, tab, newline,
                            # carriage return, vertical tab, form feed
x <- c(
  " x y ",           # spaces before, after and in between
  " \u2190 \u2192 ", # contains unicode chars
  paste0(            # varied whitespace
    whitespace,
    "x",
    whitespace,
    "y",
    whitespace,
    collapse = ""
  ),
  NA                 # missing
)
## [1] " x y "
## [2] " ← → "
## [3] " \t\n\r\v\fx \t\n\r\v\fy \t\n\r\v\f"
## [4] NA
gsub
gsub replaces all instances of a string (fixed = TRUE) or regular expression (fixed = FALSE, the default) with another string. To remove all spaces, use:
gsub(" ", "", x, fixed = TRUE)
## [1] "xy" "←→"
## [3] "\t\n\r\v\fx\t\n\r\v\fy\t\n\r\v\f" NA
As DWin noted, in this case fixed = TRUE isn't necessary but provides slightly better performance since matching a fixed string is faster than matching a regular expression.
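A quick check of that claim for a literal space, as a minimal sketch in base R. The vector y is invented for illustration and is not part of the original answer. The last two lines show a case where fixed = TRUE does change the result, because the pattern contains a regular expression metacharacter.

```r
# Hypothetical input, not from the original answer
y <- c(" a b ", "c  d", NA)

fixed_result <- gsub(" ", "", y, fixed = TRUE)  # literal string match
regex_result <- gsub(" ", "", y)                # regular expression match

fixed_result
## [1] "ab" "cd" NA
identical(fixed_result, regex_result)
## [1] TRUE

# With a metacharacter in the pattern the two modes differ:
gsub(".", "", "a.b")                # "." matches any character
## [1] ""
gsub(".", "", "a.b", fixed = TRUE)  # "." matches only a literal dot
## [1] "ab"
```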
If you want to remove all types of whitespace, use:
gsub("[[:space:]]", "", x) # note the double square brackets
## [1] "xy" "←→" "xy" NA
gsub("\\s", "", x) # same; note the double backslash
library(regex)
gsub(space(), "", x) # same
"[:space:]" is a POSIX character class matching all space characters; it has to appear inside a bracket expression, hence "[[:space:]]". \s (written "\\s" in an R string) is a shorthand understood by most regular-expression engines that does the same thing.
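To make the difference between removing spaces and removing all whitespace concrete, here is a small sketch in base R using R's default regular expression engine. The string y is a made-up example.

```r
# Hypothetical input mixing a space, a tab and a newline
y <- " a\tb\nc d "

gsub(" ", "", y, fixed = TRUE)  # only literal spaces are removed
## [1] "a\tb\ncd"
gsub("[[:space:]]", "", y)      # spaces, tabs and newlines are removed
## [1] "abcd"
gsub("\\s", "", y)              # same result
## [1] "abcd"
```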
stringr: str_replace_all and str_trim
stringr provides more human-readable wrappers around the base R functions (though as of Dec 2014, the development version has a branch built on top of stringi, mentioned below). The equivalents of the above commands, using str_replace_all, are:
library(stringr)
str_replace_all(x, fixed(" "), "")
str_replace_all(x, space(), "")
stringr also has a str_trim function which removes only leading and trailing whitespace.
str_trim(x)
## [1] "x y" "← →" "x \t\n\r\v\fy" NA
str_trim(x, "left")
## [1] "x y " "← → "
## [3] "x \t\n\r\v\fy \t\n\r\v\f" NA
str_trim(x, "right")
## [1] " x y" " ← →"
## [3] " \t\n\r\v\fx \t\n\r\v\fy" NA
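If you would rather not load a package just for trimming, base R's trimws (available since R 3.2.0) covers the same three cases. The sketch below uses an invented string y and contrasts trimming with removing every whitespace character.

```r
# Hypothetical input with whitespace at both ends and in the middle
y <- " \t x y \n"

trimws(y)                    # both ends (the default)
## [1] "x y"
trimws(y, which = "left")    # leading whitespace only
## [1] "x y \n"
trimws(y, which = "right")   # trailing whitespace only
## [1] " \t x y"
gsub("[[:space:]]", "", y)   # every whitespace character, inner ones too
## [1] "xy"
```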
stringi: stri_replace_all_charclass and stri_trim
stringi is built upon the platform-independent ICU library, and has an extensive set of string manipulation functions. The equivalents of the above are:
library(stringi)
stri_replace_all_fixed(x, " ", "")
stri_replace_all_charclass(x, "\\p{WHITE_SPACE}", "")
Here "\\p{WHITE_SPACE}" is an alternate syntax for the set of Unicode code points considered to be whitespace, equivalent to "[[:space:]]", "\\s" and space(). For more complex regular expression replacements, there is also stri_replace_all_regex.
stringi also has trim functions.
stri_trim(x)
stri_trim_both(x) # same
stri_trim(x, "left")
stri_trim_left(x) # same
stri_trim(x, "right")
stri_trim_right(x) # same
@DWin Supposedly it is faster if R knows that it does not have to invoke the regular expression stuff. In this case it does not really make any difference, I am just in the habit of doing so.
– Aniko
May 13 '11 at 13:00
Is there a difference between "[[:space:]]" and "\\s"?
– Sacha Epskamp
May 13 '11 at 13:56
if you check on flyordie.sin.khk.be/2011/05/04/day-35-replacing-characters or just type in ?regex then you see that [:space:] is used for "Space characters: tab, newline, vertical tab, form feed, carriage return, and space." That's a lot more than space alone
– Sir Ksilem
May 13 '11 at 14:25
@Aniko Hope you don't mind about the big edit. Since this question is highly popular, it looked like the answer needed to be more thorough.
– Richie Cotton
Dec 31 '14 at 10:52
I just learned about the "stringr" package to remove white space from the beginning and end of a string with str_trim( , side="both") but it also has a replacement function so that:
a <- " xx yy 11 22 33 "
str_replace_all(string = a, pattern = " ", replacement = "")
[1] "xxyy112233"
The stringr package doesn't work well with every encoding. The stringi package is a better solution; for more info, check github.com/Rexamine/stringi
– bartektartanus
Feb 20 '14 at 10:59
Please note that the solutions written above remove only spaces. If you also want to remove tabs or newlines, use stri_replace_all_charclass from the stringi package.
library(stringi)
stri_replace_all_charclass(" ala \t ma \n kota ", "\\p{WHITE_SPACE}", "")
## [1] "alamakota"
stringi package is on CRAN now, enjoy! :)
– bartektartanus
Mar 15 '14 at 13:12
This command above is incorrect. The right way is stri_replace_all_charclass(" ala \t ma \n kota ", "\\p{WHITE_SPACE}", "")
– Lucas Fortini
Aug 7 '14 at 2:10
After using stringi for a few months now and seen/learned how powerful and efficient it is, it has become my go-to package for string operations. You guys did an awesome job with it.
– Rich Scriven
Dec 29 '14 at 1:41
Use [[:blank:]] to match any kind of horizontal white_space characters.
gsub("[[:blank:]]", "", " xx yy 11 22 33 ")
# [1] "xxyy112233"
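Keep in mind that [[:blank:]] covers only spaces and tabs, so line breaks survive. A short base R sketch with an invented string y shows the difference from [[:space:]]:

```r
# Hypothetical input containing a space, a tab and a newline
y <- "a \t b\nc"

gsub("[[:blank:]]", "", y)   # space and tab removed, newline kept
## [1] "ab\nc"
gsub("[[:space:]]", "", y)   # newline removed as well
## [1] "abc"
```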
x = "xx yy 11 22 33"
gsub(" ", "", x)
[1] "xxyy112233"
The function str_squish() from package stringr of tidyverse does the magic!
library(dplyr)
library(stringr)
df <- data.frame(a = c(" aZe aze s", "wxc s aze "),
b = c(" 12 12 ", "34e e4 "),
stringsAsFactors = FALSE)
df <- df %>%
rowwise() %>%
mutate_all(funs(str_squish(.))) %>%
ungroup()
df
# A tibble: 2 x 2
a b
<chr> <chr>
1 aZe aze s 12 12
2 wxc s aze 34e e4
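Note that str_squish trims the ends and collapses each inner run of whitespace to a single space; it does not remove whitespace entirely. The sketch below imitates that behaviour in base R with a hypothetical helper named squish (an approximation for illustration, not stringr's implementation) and compares it with removing all whitespace.

```r
# Hypothetical helper: trim both ends, then collapse inner whitespace runs
squish <- function(s) gsub("[[:space:]]+", " ", trimws(s))

# Made-up input with repeated spaces
y <- "  aZe   aze  s "

squish(y)                    # single spaces remain between words
## [1] "aZe aze s"
gsub("[[:space:]]", "", y)   # all whitespace removed
## [1] "aZeazes"
```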
Please do not link to code. Add it in the text body of your answer and explain it here, to give your answer more longterm value.
– R Balasubramanian
Aug 7 at 14:04
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review
– Rui Barradas
Aug 7 at 16:50
Thanks @RBalasubramanian for reminding me of this guideline. I will follow it in the future.
– damianooldoni
Aug 9 at 8:26
I don't see how this answers the question. str_squish doesn't remove all spaces. It just trims and substitutes multiple spaces for one.
– Nettle
Aug 16 at 21:01
Try this:
remove fill blank.
2. 1.
| |
V V
display subinstr(stritrim(" xx yy 11 22 33 "), " ", "",.)
@Aniko. Is there a reason you used fixed=TRUE?
– 42-
May 13 '11 at 12:57