[sldev] Cache speed experiment & results...

Kent Quirk (Q Linden) q at lindenlab.com
Wed Jun 4 14:17:11 PDT 2008


On Jun 4, 2008, at 4:25 PM, Robin Cornelius wrote:

> darien_caldwell at comcast.net wrote:
>> This reminds me of something I once pondered. Do you think the  
>> UUIDs are random enough, that one could rely on a comparison of  
>> only the first 8 digits as a sort of quick and dirty  
>> identification? What would the chance be of a false match in such a  
>> case?
>>
>
> Well for the 1st digit there is a 1/16 chance of a hit, for 2 digits  
> its
> 1/256 chance for the Nth digit its a 1/(16^N) chance, so for your
> specific question its 1/4294967296 (1 in about 4.2 billion).
>
> Chance of winning the UK lottery is about 1/14 Million for a  
> perspective
> BUT i can generate *vastly* more assets than i can ever buy lottery
> tickets and they are being generated by a lot of sources. But 8 may
> provide a useful sort providing you do allow for the possibility of a
> collision and have a fall back to higher precision.
>
> Either that or i've gone nuts and cant do math(s) any more, which is  
> not
> an unreasonable assumption.

There are a couple more issues here:

a) UUIDs are constructed from a collection of bytes, some of which are  
a timestamp. Using a subset of them might well lead to collisions.
b) The probability of a collision if you only use 8 bits is about 1.1%  
if you have 10K items, and about 68% if you have 100K items. That  
seems WAY too high to me. It's the birthday "paradox" -- the  
probability that two people have the same birthday. In python, the  
formula is:

import math
def birthday(n, bits):
     return 1. - math.exp((-(n*n) / (2.*pow(2, bits))))

 >>> birthday(10000, 32)
0.011574031737030754
 >>> birthday(100000, 32)
0.68781309569379223
 >>> birthday(1000, 32)
0.00011640854582628535


Why bother? Comparing strings is slowest if they agree, and fastest if  
they disagree...so you only pay for a full length comparison if  
they're the same, which is then swamped by what you do with the item  
once you've found it. This definitely feels like an unnecessary  
"optimization".





More information about the SLDev mailing list