Class RandomSelector<T>

java.lang.Object
org.plumelib.util.RandomSelector<T>
Type Parameters:
T - the type of elements being selected over

public class RandomSelector<T> extends Object
RandomSelector selects k elements uniformly at random from an arbitrary iterator, using O(k) space. A naive algorithm would use O(n) space. For example, selecting 1 element from a FileStream containing 1000 elements will take O(1) space. The class takes as input the number k during initialization and then can accept() any number of Objects in the future. At any point in time, getValues() will either return k randomly selected elements from the elements previous accepted or if accept() was called fewer than k times, will return all elements previously accepted.

The random selection is independent between every constructed instance of RandomSelector objects, but for the same instance, multiple calls to getValues() are not independent. Making two calls to consecutive getValues() without an accept() in between will return two new Lists containing the same elements.

A second mode allows for a fixed probability of randomly keeping each item as opposed to a fixed number of samples.

SPECFIELDS:
values : Set : The values chosen based on the Objects observed
observed : int : The number of Objects observed
numElts : int : The number of elements to choose ('k' above)
keepProbability: double : The percentage of elements to keep
selector_mode : {FIXED,PERCENT} : either fixed amount of samples or fixed percent.

Example use:


 // randomly selects 100 lines of text from a file
 List selectedLines = null;
 try {
    BufferedReader br = new BufferedReader(new FileReader("myfile.txt"));
    RandomSelector selector = new RandomSelector(100);
    while (br.ready()) {
      selector.accept(br.readLine());
    }
    selectedLines = selector.getValues();
  }
  catch (IOException e2) { e2.printStackTrace(); }
 
  • Constructor Summary

    Constructors
    Constructor
    Description
    RandomSelector(double keepProbability, Random r)
    Creates a new RandomSelector.
    RandomSelector(int numElts)
    Creates a new RandomSelector.
    RandomSelector(int numElts, Random r)
    Creates a new RandomSelector.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    accept(T next)
    When in fixed sample mode, increments the number of observed elements i by 1, then with probability k / i, the Object 'next' will be added to the currently selected values 'values' where k is equal to 'numElts'.
    Returns values, modifies none.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • RandomSelector

      public RandomSelector(int numElts)
      Creates a new RandomSelector.
      Parameters:
      numElts - the number of elements intended to be selected from the input elements
    • RandomSelector

      public RandomSelector(int numElts, Random r)
      Creates a new RandomSelector.
      Parameters:
      numElts - the number of elements intended to be selected from the input elements
      r - the seed to give for random number generation
    • RandomSelector

      public RandomSelector(double keepProbability, Random r)
      Creates a new RandomSelector.
      Parameters:
      keepProbability - the probability that each element is selected from the oncoming Iteration
      r - the seed to give for random number generation
  • Method Details

    • accept

      public void accept(T next)
      When in fixed sample mode, increments the number of observed elements i by 1, then with probability k / i, the Object 'next' will be added to the currently selected values 'values' where k is equal to 'numElts'. If the size of values exceeds numElts, then one of the existing elements in values will be removed at random.

      When in probability mode, adds next to 'values' with probability equal to 'keepProbability'.

      Parameters:
      next - value to be added to this selector
    • getValues

      public List<T> getValues()
      Returns values, modifies none.
      Returns:
      values