AsciiToInt {sfsmisc} | R Documentation |
AsciiToInt returns integer codes in 0:255 for each (one byte) character in strings. ichar is an alias for it, for old S compatibility.
strcodes implements in R the basic engine for translating characters to corresponding integer codes.
chars8bit() is the inverse function of AsciiToInt, producing “one byte” characters from integer codes. Note that it (and hence strcodes()) depends on the locale, see Sys.getlocale().
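For pure-ASCII input, the character/code correspondence described above can be tried out with base R alone. The following is a minimal sketch that uses charToRaw() and rawToChar() as stand-ins; it is an illustration under that assumption, not how sfsmisc implements AsciiToInt() or chars8bit().

```r
# Base-R analogue for ASCII strings (illustrative assumption, not the sfsmisc code)
s <- "ABC"
codes <- as.integer(charToRaw(s))   # one integer code per byte of the string
back  <- rawToChar(as.raw(codes))   # turn the codes back into a string
print(codes)
print(back)
```

For ASCII strings the round trip is lossless; for non-ASCII input the byte values depend on the string's encoding.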
AsciiToInt(strings)
ichar(strings)
chars8bit(i = 1:255)
strcodes(x, table = chars8bit(1:255))
strings, x | |
i | numeric (integer) vector of values in |
table | a vector of (unique) character strings, typically of one character each. |
Only codes in 1:127 make up the ASCII encoding, which should be identical for all R versions, whereas the ‘upper’ half is often determined from the ISO-8859-1 (aka “ISO-Latin 1”) encoding, but may well differ, depending on the locale setting, see also Sys.setlocale.
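To illustrate the ‘upper’ half, the sketch below interprets the single byte 252 as ISO-8859-1 and converts it to UTF-8 with base R's iconv(). It assumes the platform's iconv supports the "latin1" and "UTF-8" encoding names, and it does not involve chars8bit() itself.

```r
# Byte 252 interpreted as ISO-8859-1 and converted to UTF-8
x     <- iconv(rawToChar(as.raw(252)), from = "latin1", to = "UTF-8")
cp    <- utf8ToInt(x)    # Unicode code point of the converted character
bytes <- charToRaw(x)    # its UTF-8 representation needs more than one byte
print(cp)
print(bytes)
```

In ISO-8859-1 the code and the Unicode code point coincide, which is why a one-byte view of such characters only makes sense in a matching locale.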
Note that 0 is no longer allowed, since R does not allow \0 (aka nul) characters in a string anymore.
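The restriction on nul characters can be observed in base R. As a small sketch (using rawToChar() rather than chars8bit()), building a string from raw bytes with an embedded 0 signals an error, which is caught here with tryCatch():

```r
# An embedded nul byte cannot be part of an R string
res <- tryCatch(rawToChar(as.raw(c(65, 0, 66))),
                error = function(e) e)
is_error <- inherits(res, "error")
print(is_error)
if (is_error) print(conditionMessage(res))
```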
AsciiToInt (and hence ichar) and chars8bit return a vector of the same length as their argument.
strcodes(x, tab) returns a list of the same length and names as x with list components of integer vectors with codes in 1:255.
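The shape of the strcodes() return value can be mimicked with a simple table lookup. The helper strcodes_sketch() and the ASCII-only table below are hypothetical names introduced for illustration; this is a sketch of the idea (split each string into characters, then match() them against a table), not necessarily the sfsmisc implementation.

```r
# Hypothetical helper: look up each character's position in a table.
# The table is restricted to the ASCII codes 1:127 so that it does not
# depend on the locale.
ascii_table <- rawToChar(as.raw(1:127), multiple = TRUE)

strcodes_sketch <- function(x, table = ascii_table) {
  # strsplit() keeps the names of x; match() returns NA for characters
  # that are not in the table
  lapply(strsplit(x, ""), match, table = table)
}

res <- strcodes_sketch(c(a = "ABC", ch = "1234", place = "Z\u00fcrich"))
print(res)
```

With this ASCII-only table, the non-ASCII character in the last string has no match and yields NA, while the list keeps the length and names of the input.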
Martin Maechler, partly in 1991 for S-plus
chars8bit(65:70)#-> "A" "B" .. "F"
stopifnot(identical(LETTERS, chars8bit(65:90)),
          identical(AsciiToInt(LETTERS), 65:90))

## may only work in ISO-latin1 locale (not in UTF-8):
try( strcodes(c(a= "ABC", ch="1234", place = "Zürich")) )
## in "latin-1" gives {otherwise should give NA instead of 252}:
## Not run: 
$a
[1] 65 66 67

$ch
[1] 49 50 51 52

$place
[1]  90 252 114 105  99 104

## End(Not run)

myloc <- Sys.getlocale()
if(.Platform $ OS.type == "unix") { # ``should work'' here
  try( Sys.setlocale(locale = "de_CH") )# "try": just in case
  print(strcodes(c(a= "ABC", ch="1234", place = "Zürich"))) # no NA hopefully
  print(AsciiToInt(chars8bit()))# -> 1:255 {if setting latin1 succeeded above}
  print(chars8bit(97:140))
  try( Sys.setlocale(locale = "de_CH.utf-8") )# "try": just in case
  print(chars8bit(97:140)) ## typically looks different than above
}

## Resetting to original locale .. works "mostly":
lapply(strsplit(strsplit(myloc, ";")[[1]], "="),
       function(cc) try(Sys.setlocale(cc[1], cc[2]))) -> .scratch
Sys.getlocale() == myloc # TRUE if we have succeeded to reset it