Here's `setdiff`'s normal behaviour:

```
x <- rep(letters[1:4], 2)
x
# [1] "a" "b" "c" "d" "a" "b" "c" "d"
y <- letters[1:2]
y
# [1] "a" "b"
setdiff(x, y)
# [1] "c" "d"
```

… but what if I want `y` to be taken out *only once*, and therefore get the following result?

`# "c" "d" "a" "b" "c" "d"`

I'm guessing that there is an easy solution using either `setdiff` or `%in%`, but I just cannot see it.

# Answer

`match` returns a vector of the positions of (first) matches of its first argument in its second. It's used as an index constructor:

```
x[ -match(y,x) ]
#[1] "c" "d" "a" "b" "c" "d"
```
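To make the indexing step concrete, here is a minimal sketch of what `match` returns and of two edge cases worth guarding against. The helper name `remove_first` is hypothetical (it is not part of the original answer or of base R); it assumes you want elements of `y` that are absent from `x` to be silently ignored.

```r
x <- rep(letters[1:4], 2)
y <- letters[1:2]

# Positions of the first occurrence of each element of y in x
match(y, x)
# [1] 1 2

# Hypothetical helper: drop the first match of each element of y,
# skipping elements of y that do not occur in x (match() gives NA for those).
remove_first <- function(x, y) {
  pos <- match(y, x)
  pos <- pos[!is.na(pos)]
  # x[-integer(0)] would select nothing, so return x unchanged instead
  if (length(pos) == 0) return(x)
  x[-pos]
}

remove_first(x, c("a", "z"))
# [1] "b" "c" "d" "a" "b" "c" "d"
```

Without the guard, a single `NA` in the index makes `x[-match(y, x)]` return `NA` values rather than the remaining elements.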

If `y` contains duplicates and you want each occurrence in `y` to remove one matching element of `x`, then the first thing that came to my mind is a for-loop:

```
y <- c("a", "b", "a")
x2 <- x
# Remove the first remaining match of each element of y, one at a time
for (i in seq_along(y)) {
  x2 <- x2[-match(y[i], x2)]
}
x2
# [1] "c" "d" "b" "c" "d"
```
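For comparison, here is one possible loop-free sketch of the same idea. It is not from the original answer: it tags every element with its running occurrence number using the `ave(seq_along(v), v, FUN = seq_along)` idiom and then removes the elements of `x` whose value/occurrence pair also appears in `y`. The function names `occurrence` and `remove_each_once` are hypothetical, and the sketch assumes non-empty atomic vectors whose elements can be compared after conversion to character.

```r
# Running count of each value: the 1st "a" gets 1, the 2nd "a" gets 2, ...
occurrence <- function(v) ave(seq_along(v), v, FUN = seq_along)

# Hypothetical helper: each element of y removes at most one equal element of x
remove_each_once <- function(x, y) {
  x_key <- paste(x, occurrence(x))
  y_key <- paste(y, occurrence(y))
  x[!x_key %in% y_key]
}

x <- rep(letters[1:4], 2)
remove_each_once(x, c("a", "b", "a"))
# [1] "c" "d" "b" "c" "d"
```

Unlike the loop, this version leaves `x` unchanged for elements of `y` that never occur in `x`, because their keys simply find no match.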

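This would be one possible result of using the tabling approach. It uses some "set" functions, although this is not really a set problem, and it seems somewhat more "vectorised". Note that it returns the remaining count of each value rather than the remaining elements themselves: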

```
c(table(x[x %in% intersect(x, y)]) - table(y[y %in% intersect(x, y)]),
  table(x[!x %in% intersect(x, y)]))
# a b c d
# 0 1 2 2
```
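If a vector is wanted rather than counts, the table can be expanded again with `rep`. This is only a sketch under stated assumptions: the variable names `common`, `counts` and `remaining` are introduced here for illustration, `pmax` is added as a guard for the case where `y` holds more copies of a value than `x`, and the result comes out in sorted name order, not in the original order of `x`.

```r
x <- rep(letters[1:4], 2)
y <- c("a", "b", "a")

common <- intersect(x, y)
# Remaining count per value: shared values first, then values only in x
counts <- c(table(x[x %in% common]) - table(y[y %in% common]),
            table(x[!x %in% common]))

# Repeat each name by its remaining count; negative counts are clamped to 0
remaining <- rep(names(counts), pmax(counts, 0))
remaining
# [1] "b" "c" "c" "d" "d"
```

Because the original positions are lost, this suits cases where only the multiset of remaining values matters.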

Answer author 42