Sampling : subset of data to be analyzed such that it has approximately the same properties of the original data
Random Sampling:Each element has equal probability of being selected
Reservior Sampling : A technique to maintain online random sample
Reservior Sampling Algorithm
1.Used to maintain an online random sample
2.Maintains a sample of size k, known as reservoir
3.Every new element has a probability k/n of replacing an old element in the reservoir
4.Let us say, k=1: An item in a stream of length n replaces the element in the reservoir with probability 1/n
Hot List Problem
Goal to Achieve
1.continuously maintain a list of the top-k most frequent elements in a stream
2.rank the items and absolute count is immaterial
Hot List Problem + Frequent Algorithm
Hot List Problem - Space Saving Algorithm
All the messages below are just forwarded messages if some one feels hurt about it please add your comments we will remove the post. Host/author is not responsible for these posts.
No comments:
Post a Comment