Lee Rhodes
01/26/2021, 8:47 PM
Mayank
Lee Rhodes
01/26/2021, 10:48 PM
((getUpperBound(2) / getEstimate()) - 1) * sqrt(K) / 2
This will be the factor by which your intersection error exceeds the nominal RSE of the sketch. If this results in a 2, that means the estimated error of that operation will be about twice as large or, in this case, about +/- 6.25% (at 95% confidence).
At least this allows you to monitor the relative error of intersections and even to determine which operations caused the largest increase in error.
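As a rough sketch of the expression above: the function below takes the values you would get from getUpperBound(2) and getEstimate() on the result sketch, plus the configured K, and returns the error factor. The function name and the sample numbers are hypothetical. K = 4096 is an assumption inferred from the 6.25% figure, since 2 / sqrt(4096) = 3.125% and twice that is 6.25%.

```python
import math


def intersection_error_factor(upper_bound_2sd: float, estimate: float, k: int) -> float:
    """Return how many times the observed 2-sigma relative error exceeds
    the nominal 2-sigma relative error (2 / sqrt(k)) of the sketch.

    upper_bound_2sd: value of getUpperBound(2) on the result sketch
    estimate:        value of getEstimate() on the result sketch
    k:               nominal entries (K) the sketch was configured with
    """
    if estimate <= 0:
        raise ValueError("estimate must be positive")
    # Observed relative error at 2 standard deviations (~95% confidence).
    observed = upper_bound_2sd / estimate - 1.0
    # Same as observed / (2 / sqrt(k)), written as in the expression above.
    return observed * math.sqrt(k) / 2.0


# Hypothetical numbers: K = 4096 gives a nominal error of 2/64 = 3.125%.
k = 4096
estimate = 10_000.0
upper_bound = 10_625.0  # 6.25% above the estimate

factor = intersection_error_factor(upper_bound, estimate, k)
observed_pct = (upper_bound / estimate - 1.0) * 100.0
print(f"factor = {factor:.2f}, observed error = +/- {observed_pct:.2f}%")
```

Logging this factor after each intersection in a chain of set operations is one way to see which step caused the largest jump in error.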
You can also try scheduling the sequence of your set operations so that all of your intersections occur either early in the sequence or at the end. Depending on your data, you might find that reordering the sequence helps.
Other than that, know that the intersection error of the theta sketches approaches the theoretical limit of what is possible, given a streaming algorithm and limited space.
I hope this helps.
Mayank
Lee Rhodes
01/26/2021, 10:52 PM
Mayank
Lee Rhodes
01/26/2021, 11:12 PM
Ken Krugler
01/26/2021, 11:27 PM
Mayank
Mayank
Lee Rhodes
01/26/2021, 11:30 PM
Lee Rhodes
01/26/2021, 11:33 PM
Karin Wolok
Lee Rhodes
01/27/2021, 1:08 AM
Lee Rhodes
01/27/2021, 1:10 AM
Lee Rhodes
01/27/2021, 5:31 AM
Ken Krugler
01/27/2021, 3:12 PM
Lee Rhodes
01/27/2021, 5:51 PM
Karin Wolok
Ken Krugler
01/28/2021, 1:08 AM
Karin Wolok
Karin Wolok
Lee Rhodes
01/28/2021, 4:48 AM
Mayank
Lee Rhodes
02/02/2021, 6:52 PM
Lee Rhodes
02/02/2021, 6:52 PM
Ken Krugler
02/02/2021, 7:03 PM
Lee Rhodes
02/02/2021, 7:04 PM