;

Beyond Bloom Filters: From Approximate Membership Checks to Approximate State Machines


View PaperPublic Review (by Ellen Zegura)

Flavio Bonomi, Cisco Systems
Michael Mitzenmacher, Harvard University
Rina Panigraphy, Stanford University
Sushil Singh, Cisco Systems
George Varghese, Cisco Systems / UCSD

Abstract
Many networking applications require fast state lookups in a concurrent state machine, which tracks the state of a large number of flows simultaneously. We consider the question of how to compactly represent such concurrent state machines. To achieve compactness, we consider data structures for Approximate Concurrent State Machines (ACSMs) that can return false positives, false negatives, or a “don’t know” response. We describe three techniques based on Bloom filters and hashing, and evaluate them using both theoretical analysis and simulation. Our analysis leads us to an extremely efficient hashing-based scheme with several parameters that can be chosen to trade off space, computation, and the impact of errors. Our hashing approach also yields a simple alternative structure with the same functionality as a counting Bloom filter that uses much less space. We show how ACSMs can be used for video congestion control. Using an ACSM, a router can implement sophisticated Active Queue Management (AQM) techniques for video traffic (without the need for standards changes to mark packets or change video formats), with a factor of four reduction in memory compared to full-state schemes and with very little error. We also show that ACSMs show promise for real-time detection of P2P traffic.

Comments

  1. Posted by Paul Tarjan on Friday August 18, 6:42am
    Comment from KaiZheng:

    The idea is indeed very interesting and creative, well done!
    But I have some minor questions about the formulas and some logical aspect, in the paper (Page 5 and 6):

    1) In page 5, it seems that the formula for the false positive f is with typos, where (1-e^(kn/m))^k should be (1-e^(-kn/m))^k;

    2)In page 6, the formula at the end of the left col, why parameter m is missing/omitted? It seems not that correct;

    3)For section 3.3, the Stateful Bloom Filter Approach, the paper describes that for a lookup, “if all cell values have value i or DK (and at least one cell has value i), return i”. And later, the paper also declares that such mechanism “guarantee that never return an incorrect value for a flow”.
    But I wonder, what if some of the DK (i.e. “?”) values are contributed by the element(s) with value other than i? For example, for a lookup, we result in k-1 “i”s and one DK; however,this DK is the result of the collision of one element with value i-1 and the other with i+1. Actually, in this case, the item is not in the set. But it would return i in the proposed scheme…..Am I right?

    It seems that the issue did not addressed in the current manuscript.
  2. Posted by Tal Achituv on Monday December 31, 8:39pm
    Regarding comment #1.3:

    When looking-up the value for a flow you can ignore all DK values, as a matter of fact a simple analysis shows that if all non-DK cells agree (contain the same value) that is enough to show that this is the value that should be returned for the flow.

    Consider the case where all cells contain DK except one which contains the value 'i', this means that when this flow was last inserted (unless it's a false-positive and then it's a vacuous statement to say that it was ever inserted) this specific cell was empty and the flow MUST have had the value 'i'. the proof is simple, if the cell was not empty - it would have turned to DK. if any other flow would have inserted a different value to that cell - it would have turned to DK as well. we are left with the conclusion that IF the flow was inserted to the set IT MUST HAVE BEEN inserted with the value 'i'.
  3. Posted by shark sharok on Tuesday September 8, 11:01am
    http://yalasex.info/
  4. Posted by shark sharok on Tuesday September 8, 11:01am
    http://yalasex.info/

Login to post a comment