c++ - What is the most efficient way of selecting a random element from a long (and reasonably) sparse vector?

May 15, 2014

I have a long, reasonably sparse boolean vector that I want to iteratively select random elements from, and I was wondering what the most efficient way of doing this would be.

The vector can be around 100,000 elements long, and about 1 in every 20 elements is "true" at any one time.

Selecting one of these elements can result in other elements becoming available for selection, so I can't do a single, initial pass of the boolean vector to get the indices of the available elements, shuffle that vector, and pop elements, because the list of available elements changes.

I have worked out a couple of ideas, but can't tell which is best. Any insight would be appreciated.

Method 1:

    given input boolean vector A
    create boolean vector B    // to store the selected elements
    create int vector C        // to store the available element indices

    while stopping condition not met:
        for each element a in A:
            if a is "true":
                append the index of a to C
        generate a random integer i between 0 and the length of C
        set the i-th element of C in A to "false"
        set the i-th element of C in B to "true"
        compute the new "true" values of A
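As a rough C++ sketch of one iteration of Method 1: the function name `pick_by_full_scan` is made up for illustration, C is rebuilt from scratch on each call, and the step that computes the new "true" values of A is problem-specific, so it is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// One iteration of Method 1: rebuild the list of available indices with a
// full pass over A, then pick one of them uniformly at random.
// Returns the chosen index into A; requires at least one "true" element.
// (pick_by_full_scan is a hypothetical name, not from the question.)
std::size_t pick_by_full_scan(std::vector<bool>& A, std::vector<bool>& B,
                              std::mt19937& rng) {
    std::vector<std::size_t> C;  // indices of the available elements
    for (std::size_t k = 0; k < A.size(); ++k) {
        if (A[k]) C.push_back(k);
    }
    assert(!C.empty());

    std::uniform_int_distribution<std::size_t> dist(0, C.size() - 1);
    const std::size_t chosen = C[dist(rng)];

    A[chosen] = false;  // no longer available
    B[chosen] = true;   // mark as selected
    return chosen;      // caller then computes the new "true" values of A
}
```

Every call walks all of A, so with 100,000 elements each selection costs a full pass regardless of how few elements are available.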

Method 2:

    given input boolean vector A
    create boolean vector B    // to store the selected elements
    create int vector C        // to store the available element indices

    for each element a in A:
        if a is "true":
            append the index of a to C
    shuffle C
    while stopping condition not met:
        pop an index i from C
        set element i of A to "false"
        set element i of B to "true"
        compute the new "true" values of A
        if new "true" values were computed:
            append the index of each newly available element to C
            shuffle C

Because not every selection results in a change to the set of available elements, I think method 2 is potentially better than method 1, except that I'm not sure how much effort shuffling a long vector would take.
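On the shuffling cost: `std::shuffle` does work linear in the length of the vector, so reshuffling C after every insertion costs O(|C|) each time. One possible way around this, which is not part of the pseudocode above, is to append the new index and swap it with a uniformly chosen slot. If C was in uniformly random order before, it still is afterwards (this is the "inside-out" form of the Fisher-Yates shuffle). A minimal sketch, with the hypothetical names `insert_shuffled` and `pop_shuffled`:

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Adds `index` to an already shuffled vector C in O(1), keeping the order
// uniformly random, instead of reshuffling the whole vector.
void insert_shuffled(std::vector<std::size_t>& C, std::size_t index,
                     std::mt19937& rng) {
    C.push_back(index);
    std::uniform_int_distribution<std::size_t> dist(0, C.size() - 1);
    std::swap(C.back(), C[dist(rng)]);  // may swap the new element with itself
}

// Removes and returns the last index of C; if C is in random order, this is
// a uniformly random available index.
std::size_t pop_shuffled(std::vector<std::size_t>& C) {
    assert(!C.empty());
    const std::size_t i = C.back();
    C.pop_back();
    return i;
}
```

With this variant the only full shuffle is the initial one.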

Method 3:

    given input boolean vector A
    create boolean vector B    // to store the selected elements

    while stopping condition not met:
        generate a random integer i between 0 and the length of A
        if element i is "true" in A:
            set element i in A to "false"
            set element i in B to "true"
            compute the new "true" values of A
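Method 3 amounts to rejection sampling, which could look roughly like this in C++. The name `pick_by_rejection` and the `max_tries` cap are assumptions added for the sketch; the cap only exists so the loop cannot spin forever when nothing is available.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Method 3 as rejection sampling: draw random positions in A until one of
// them is "true". max_tries (a hypothetical parameter) bounds the number of
// draws. Returns the chosen index, or A.size() if no hit was found in time.
std::size_t pick_by_rejection(std::vector<bool>& A, std::vector<bool>& B,
                              std::mt19937& rng, std::size_t max_tries) {
    assert(!A.empty());
    std::uniform_int_distribution<std::size_t> dist(0, A.size() - 1);
    for (std::size_t t = 0; t < max_tries; ++t) {
        const std::size_t i = dist(rng);
        if (A[i]) {
            A[i] = false;  // no longer available
            B[i] = true;   // mark as selected
            return i;      // caller then computes the new "true" values of A
        }
    }
    return A.size();  // gave up: nothing found within max_tries draws
}
```

No auxiliary index vector is needed, but the cost per selection depends entirely on how dense the "true" values are.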

This final way seems a bit naive and simple, but I figured that if 1 in every 20 elements is true (except for the last group of elements, when no more can be added to the ones selected), then on average it will take 20 tries to find a selectable element, which is less effort than doing a full pass of the input vector or shuffling the vector of available indices (especially if the vectors in question are quite long). Finding the last few will be hard, but I can keep track of how many have been selected, and once the number left drops below a certain level I can change how the final lot are selected.
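To put a number on the "20 tries" estimate: assuming each draw is independent and uniform over the $n$ positions of A, and that $m$ of them are currently "true" (the symbols $n$, $m$, $p$ and $T$ are introduced here, not in the question), the number of draws $T$ until the first hit is geometrically distributed with success probability $p = m/n$:

```latex
\mathbb{E}[T] = \sum_{k=1}^{\infty} k\,(1-p)^{k-1}\,p = \frac{1}{p} = \frac{n}{m}
```

With $n = 100{,}000$ and $m = 5{,}000$ this gives 20 draws, matching the estimate. Once only $m = 50$ elements remain it grows to 2,000 draws per selection, which is why the tail needs a different strategy.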

Does anyone have an idea which might be more efficient? The implementation is in C++, if that makes a difference.

Thanks for any help.

You can change the representation of your sparse vector to the following:

Primary vector (the vector you have right now)
True vector (a list of the "true" indices)

Your operations then become:

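    Insert:
        check whether the element is set in the primary vector
        if false, set it to true and add its index to the true vector

    Delete:
        check whether the element is set in the primary vector
        if true, set it to false and remove its index from the true vector
        by swapping it with the last element and reducing the size

(You need pointers from the primary vector into the true vector for this.)

    Random:
        generate a random index j smaller than the size of the true vector
        return true vector[j]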

All of these operations can be done in O(1) time.
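One way this representation could be written in C++ is sketched below. The class and member names (`SparseSet`, `pos_`, `items_`, `kAbsent`, `pick_random`) are illustrative, and the sketch merges the primary vector with the "pointers" by storing, for each element, either its slot in the true vector or a sentinel meaning "false".

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Sentinel slot value meaning "this element is currently false".
constexpr std::size_t kAbsent = static_cast<std::size_t>(-1);

class SparseSet {
public:
    explicit SparseSet(std::size_t n) : pos_(n, kAbsent) {}

    bool contains(std::size_t i) const { return pos_[i] != kAbsent; }
    std::size_t size() const { return items_.size(); }

    // Insert: if i is not set, record its slot and append it to the true vector.
    void insert(std::size_t i) {
        if (contains(i)) return;
        pos_[i] = items_.size();
        items_.push_back(i);
    }

    // Delete: move the last true index into i's slot, then shrink by one.
    void erase(std::size_t i) {
        if (!contains(i)) return;
        const std::size_t slot = pos_[i];
        const std::size_t last = items_.back();
        items_[slot] = last;
        pos_[last] = slot;
        items_.pop_back();
        pos_[i] = kAbsent;
    }

    // Random: uniform choice among the current true indices (must be non-empty).
    std::size_t pick_random(std::mt19937& rng) const {
        assert(!items_.empty());
        std::uniform_int_distribution<std::size_t> dist(0, items_.size() - 1);
        return items_[dist(rng)];
    }

private:
    std::vector<std::size_t> pos_;    // pos_[i] = slot of i in items_, or kAbsent
    std::vector<std::size_t> items_;  // indices that are currently "true"
};
```

A selection step would then be `i = s.pick_random(rng); s.erase(i);` followed by `s.insert(k)` for each element `k` that becomes available.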
