Recoding example

Author

Clay Ford

Published

October 6, 2023

I was asked to help with the following scenario.

The patron had a list object similar to the following. Notice the y element has ID numbers in no particular order.

y <- paste0("ID00000",101:200)
set.seed(123)
lst <- list(x = rnorm(100), y = sample(y))
lapply(lst, head)
$x
[1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774  1.71506499

$y
[1] "ID00000130" "ID00000194" "ID00000189" "ID00000116" "ID00000188"
[6] "ID00000154"

They wanted to replace the values in the y element according to a mapping stored in a data frame. The mapping looked something like this.

set.seed(321)
new_y <- replicate(n = 100, paste(sample(letters, size = 4), collapse = ""))
d <- data.frame(y = sort(lst$y), 
                new_y)
head(d)
           y new_y
1 ID00000101  vrmp
2 ID00000102  zqdo
3 ID00000103  kyrv
4 ID00000104  oibk
5 ID00000105  rdnw
6 ID00000106  bdty

So wherever the list above has “ID00000101”, they wanted “vrmp” instead. Here was my approach.

First make the “y” column the row names of d:

rownames(d) <- d$y
head(d)
                    y new_y
ID00000101 ID00000101  vrmp
ID00000102 ID00000102  zqdo
ID00000103 ID00000103  kyrv
ID00000104 ID00000104  oibk
ID00000105 ID00000105  rdnw
ID00000106 ID00000106  bdty

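By adding row names to the data frame we have created a lookup table. Use the elements in lst$y as row names to look up the “new_y” value in d. Because the elements of lst$y are in shuffled order, the result follows the order of lst$y, not the order of d. This returns a vector.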

d[lst$y,"new_y"]
  [1] "ogqr" "sgpd" "ahrk" "laco" "fnjd" "igrd" "xblh" "pvkm" "rnly" "tfkq"
 [11] "ndck" "qewf" "ourn" "sgjr" "lgxc" "ugmk" "yrqc" "fcid" "zbat" "mpwa"
 [21] "ulzj" "rzfk" "mqen" "dhgy" "vhkc" "qxpz" "nucd" "mcek" "cwsy" "mftn"
 [31] "ctrv" "xsid" "dpms" "mbua" "cvma" "qhio" "mszb" "oygl" "tbwf" "ragi"
 [41] "veht" "inmg" "pszm" "gmas" "argq" "dbig" "bjsg" "oyzs" "ncwa" "hlvj"
 [51] "oknh" "hwrk" "pcat" "ogcm" "rdqg" "gzmw" "gzus" "mlbw" "dojv" "cjvr"
 [61] "nmwd" "ixlp" "wbfy" "eagd" "pcvj" "izku" "jhsy" "ijaf" "idaj" "bdty"
 [71] "mufc" "vrmp" "txzu" "ptag" "xout" "upec" "wmhq" "ndih" "ivms" "bsuy"
 [81] "mfwz" "pvdm" "dgje" "sixy" "vart" "rdnw" "jbil" "oibk" "kyrv" "uhal"
 [91] "ytiw" "zqdo" "mtxg" "sxkn" "zdha" "mzci" "cmvw" "cgkx" "lqnv" "pnam"
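
The same lookup can also be done without touching the row names. The following is a minimal sketch on a small made-up mapping (the names map and ids and their values are hypothetical, not the patron's data). It shows two base R alternatives: match(), which returns the position of each ID in the mapping, and a named character vector built with setNames().

```r
# Small made-up mapping table (hypothetical values, for illustration only)
map <- data.frame(y = c("ID01", "ID02", "ID03"),
                  new_y = c("aaaa", "bbbb", "cccc"),
                  stringsAsFactors = FALSE)

# IDs in no particular order, with one repeated
ids <- c("ID03", "ID01", "ID03", "ID02")

# Alternative 1: match() gives the row position of each id in map$y
by_match <- map$new_y[match(ids, map$y)]

# Alternative 2: a named character vector used as a lookup table
lookup <- setNames(map$new_y, map$y)
by_name <- unname(lookup[ids])

by_match
identical(by_match, by_name)
```

One reason to consider match() is that it only matches exactly, whereas, as I read the documentation for `[.data.frame`, character row indexing partially matches row names. With fixed-width IDs like the ones here that should make no difference.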

We can then assign the vector into the list object and replace the “ID00000xxx” values with the desired 4-letter names.

lst$y <- d[lst$y,"new_y"]
lapply(lst, head)
$x
[1] -0.56047565 -0.23017749  1.55870831  0.07050839  0.12928774  1.71506499

$y
[1] "ogqr" "sgpd" "ahrk" "laco" "fnjd" "igrd"
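
One thing to watch for with this approach is an ID that has no entry in the mapping: the lookup returns NA for it, and the assignment would silently overwrite the original value. In the example above every ID is in d, so this does not come up. The sketch below uses a small made-up mapping (map, ids, and their values are hypothetical) to show one way to check for unmatched IDs before assigning.

```r
# Small made-up mapping table (hypothetical values, for illustration only)
map <- data.frame(y = c("ID01", "ID02"),
                  new_y = c("aaaa", "bbbb"),
                  stringsAsFactors = FALSE)
rownames(map) <- map$y

# "ID09" is deliberately missing from the mapping
ids <- c("ID02", "ID09", "ID01")

# Row-name lookup: IDs without a matching row come back as NA
recoded <- map[ids, "new_y"]

# Which IDs were not found?
unmatched <- ids[is.na(recoded)]
unmatched

# One option: keep the original ID where no new value exists
recoded_safe <- ifelse(is.na(recoded), ids, recoded)
recoded_safe
```

Whether to keep the original ID, leave the NA, or stop with an error depends on the data; keeping the original ID is just one choice.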