Merge in R
I want to merge two data.frames by their row.names. When I do so I get an extra column named "Row.names":
> x
[,1] [,2] [,3]
a 1 3 5
b 2 4 6
> y
[,1] [,2] [,3]
aaa 11 13 15
b 12 14 16
> merge(x, y, by=0, all=TRUE)
Row.names V1.x V2.x V3.x V1.y V2.y V3.y
1 a 1 3 5 NA NA NA
2 aaa NA NA NA 11 13 15
3 b 2 4 6 12 14 16
Is there an easy way to make merge keep the row.names as row.names rather than as a new column?
I did this in a naïve way. I'm sure there is a way to do it inside the merge function, but I couldn't find it on the web:
> new_mat=merge(x, y, by=0, all=TRUE)
> rows = new_mat[,1]
> row.names(new_mat)= rows
> new_mat = new_mat[,2:length(colnames(new_mat))]
> new_mat
V1.x V2.x V3.x V1.y V2.y V3.y
a 1 3 5 NA NA NA
aaa NA NA NA 11 13 15
b 2 4 6 12 14 16
Thanks,
Rachelly.
I don't use merge much, but I think the extra column is created because a data.frame cannot have duplicated row names, and one still wants to be able to track the original row names.
I guess that in your case you expect at most a one-to-one relationship between rows in x and y.
The only improvement I can see is to shorten your code slightly by using a negative index to drop the first column, instead of passing the vector of column indices to keep:
new_mat <- merge(x, y, by=0, all=TRUE)
rownames(new_mat) <- as.character(new_mat[,1])
new_mat <- new_mat[,-1]
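If you do this often, the three steps above could be wrapped in a small helper. This is only a sketch: merge_by_rownames is a hypothetical name (base R has no such function), and x and y are rebuilt here as matrices on the assumption that they match the printout in the question. It relies on the merged keys being unique, which holds when each input has unique row names.

```r
# Hypothetical helper (not part of base R): merge two objects by row names
# and move the resulting "Row.names" column back into the row names.
merge_by_rownames <- function(x, y, ...) {
  merged <- merge(x, y, by = 0, ...)
  # merge() puts the keys in the first column; convert them to plain character
  rownames(merged) <- as.character(merged[, 1])
  # drop = FALSE keeps a data.frame even if only one column remains
  merged[, -1, drop = FALSE]
}

# Example inputs, assumed to match the matrices shown in the question
x <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), NULL))
y <- matrix(11:16, nrow = 2, dimnames = list(c("aaa", "b"), NULL))

new_mat <- merge_by_rownames(x, y, all = TRUE)
print(new_mat)
```

Extra arguments such as all=TRUE are passed straight through to merge, so the helper does not change how rows are matched.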