R - Choose one cell per row in a data frame
I have a vector that tells me, for each row in a data frame, the column index of the value in that row that should be updated.
    > set.seed(12008); n <- 10000; d <- data.frame(c1=1:n, c2=2*(1:n), c3=3*(1:n))
    > i <- sample.int(3, n, replace=TRUE)
    > head(d); head(i)
      c1 c2 c3
    1  1  2  3
    2  2  4  6
    3  3  6  9
    4  4  8 12
    5  5 10 15
    6  6 12 18
    [1] 3 2 2 3 2 1

This means that for rows 1 and 4, c3 should be updated; for rows 2, 3 and 5, c2 should be updated (among others). What is the cleanest way to achieve this in R using vectorized operations, i.e., without apply and friends?
EDIT: And, if at all possible, without R loops?
I have thought about transforming d into a matrix and then addressing the matrix elements with a one-dimensional vector. But I haven't found a clean way to compute the one-dimensional address from the row and column indices.
If you are willing to first convert your data.frame to a matrix, you can index the elements to be replaced using a two-column matrix. (Beginning with R-2.16.0, this will be possible with data.frames directly.) The indexing matrix should have row indices in its first column and column indices in its second column.
Here's an example:
    ## Create a subset of your data
    set.seed(12008); n <- 6
    d <- data.frame(c1=1:n, c2=2*(1:n), c3=3*(1:n))
    i <- seq_len(nrow(d))            # vector of row indices
    j <- sample(3, n, replace=TRUE)  # vector of column indices
    ij <- cbind(i, j)                # 2-column matrix to index a 2-D array
                                     # (This extends smoothly to higher-D arrays.)
    ## Convert it to a matrix
    dmat <- as.matrix(d)
    ## Replace the elements indexed by 'ij'
    dmat[ij] <- NA
    dmat
    #      c1 c2 c3
    # [1,]  1  2 NA
    # [2,]  2 NA  6
    # [3,]  3 NA  9
    # [4,]  4  8 NA
    # [5,]  5 NA 15
    # [6,] NA 12 18

Beginning with R-2.16.0, you will be able to use the same syntax for dataframes (i.e. without having to first convert dataframes to matrices).
From the R-devel NEWS file:
"Matrix indexing of dataframes by two column numeric indices is now supported for replacement as well as extraction."
Using a current R-devel snapshot, here's what it looks like:
    d[ij] <- NA
    d
    #   c1 c2 c3
    # 1  1  2 NA
    # 2  2 NA  6
    # 3  3 NA  9
    # 4  4  8 NA
    # 5  5 NA 15
    # 6 NA 12 18
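The one-dimensional addressing considered in the question should also work, since R stores matrices column by column. The following is only a minimal sketch under a few assumptions: the column choices in `j` are written out by hand rather than sampled, negating the chosen cells stands in for whatever update is actually needed, and `k` and `d2` are names made up for this illustration.

```r
## Small example data; 'j' is fixed by hand here (illustrative values)
n <- 6
d <- data.frame(c1 = 1:n, c2 = 2 * (1:n), c3 = 3 * (1:n))
i <- seq_len(n)                  # row indices
j <- c(3, 2, 2, 3, 2, 1)         # column to update in each row

dmat <- as.matrix(d)

## Matrices are stored column-major, so element (i, j) of a matrix with
## nrow(dmat) rows sits at linear position (j - 1) * nrow(dmat) + i
k <- (j - 1) * nrow(dmat) + i

## The linear index and the two-column matrix index select the same cells
stopifnot(identical(dmat[k], dmat[cbind(i, j)]))

## Update the chosen cells in place (negation is just a placeholder update),
## then convert back to a data frame
dmat[k] <- -dmat[k]
d2 <- as.data.frame(dmat)
d2
```

Both forms are vectorized and avoid apply and explicit loops; the two-column matrix index is probably easier to read, while the linear index makes the underlying arithmetic explicit.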