Hit rate testing round 2: Claim respawn rates

Read this, Einstein

and btw I do NOT have a theory ... I just use common sense ... + I can count

https://courses.lumenlearning.com/boundless-statistics/chapter/the-law-of-averages/

Have you read what you've linked?
"The law of averages is a lay term used to express a belief that outcomes of a random event will “even out” within a small sample. As invoked in everyday life, the “law” usually reflects bad statistics or wishful thinking rather than any mathematical principle."
 
Let's say leeloo does a solo run of 100 drops and gets a higher-than-average claim rate, say 40%. Then she joins your testing, "normalisation" happens, and leeloo gets a lower-than-average claim rate.

Example:
Assuming 30% is the average claim rate:
130 * 0.3 = 39 claims
40 claims happened during the first 100 solo drops; "normalisation" kicks in during your test, and leeloo gets only 2 claims in those 30 drops (6.67% HR).

If you extend it, say leeloo does another 30 drops after your test with 6 hits:
160 drops * 0.3 = 48 claims.
40 claims + 2 claims + 6 claims = 48 claims.
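
(As a quick check, here is the arithmetic above as a minimal Python sketch; the 30% rate and all drop/claim counts are just the assumed numbers from this example.)

```python
# Sketch of the arithmetic in the example above, using the assumed
# numbers from this post (30% average rate, leeloo's drop/claim counts).
avg_rate = 0.3

solo_drops, solo_claims = 100, 40    # solo run at an observed 40% HR
test_drops, test_claims = 30, 2      # the 30-drop test window
extra_drops, extra_claims = 30, 6    # 30 more drops after the test

drops = solo_drops + test_drops + extra_drops
claims = solo_claims + test_claims + extra_claims

print(drops * avg_rate)                    # 160 * 0.3 = 48.0 expected claims
print(claims)                              # 40 + 2 + 6 = 48 observed claims
print(round(test_claims / test_drops, 4))  # 0.0667 HR inside the test window
```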

So with such a small sample size, there's a risk that the "normalisation" after a higher-than-average run kicks in during your test and you get a false positive for your theory.

That's why you need to run a big consecutive sample: the impact of solo runs done before joining your test becomes negligible at a large sample size, giving you a more realistic and valid result.
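
(To illustrate the dilution argument, a minimal sketch; the 40% solo run and 30% baseline are the assumed numbers from the example above.)

```python
# Sketch: how much a 100-drop solo run at 40% pulls the cumulative rate
# away from a 30% baseline as the consecutive sample grows (assumed numbers).
solo_drops, solo_rate, base_rate = 100, 0.40, 0.30

for n in (30, 100, 1_000, 10_000):
    claims = solo_drops * solo_rate + n * base_rate
    print(f"n={n:>6}: cumulative HR = {claims / (solo_drops + n):.3f}")
# n=30 gives ~0.377, n=10,000 gives ~0.301: the early run becomes negligible
```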

What exactly do you mean by "normalization kicks in"?

You seem to be implying that there is an external source of (strong) dependence between the samples? If this is the case, then the computation of the confidence intervals may indeed be way off.
 

I should have put "normalisation" in quotation marks.
Well, if you're above the expected value, say it's coded to be 30% HR but you get 50% HR for a while, then there has to be a period of sub-30% HR for the 50% to come back down to 30%. No external source required, just the nature of randomness.
 

In that case I completely agree with kingofaces that you do not need huge samples to get very accurate tests. The "normalization" in fact implies the opposite of what you said: the "nature of randomness" is very well behaved even with a sample size of n=30. The probability of getting a big difference between the averages of two binomial samples with the same success probability becomes negligible very fast: going from sample size 30 to 50 won't change much, and going from 100 to 1,000 samples changes virtually nothing.
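
(As a rough illustration, here is a small simulation sketch; p = 0.3 and the 20-percentage-point gap are assumptions made purely for illustration.)

```python
# Sketch: estimate the chance that two samples with the SAME true hit rate
# differ by more than 20 percentage points, at several sample sizes.
import numpy as np

rng = np.random.default_rng(0)

def prob_big_gap(n, p=0.3, trials=200_000, gap=0.20):
    a = rng.binomial(n, p, trials) / n   # first tester's observed hit rate
    b = rng.binomial(n, p, trials) / n   # second tester's observed hit rate
    return np.mean(np.abs(a - b) > gap)

for n in (30, 50, 100, 1000):
    print(f"n={n:>4}: P(gap > 20pp) ~ {prob_big_gap(n):.4f}")
```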
 

Is it?
Binomial distribution:

n = 30 (sample size)
p = 0.3 (chance of a hit)
k = number of claims:

k=0 0.0023%
k=1 0.029%
k=2 0.18%
k=3 0.72%
k=4 2.1%
k=5 4.64%
k=6 8.3%
k=7 12.19%
k=8 15%
k=9 15.73%
k=10 14.16%
k=11 11%
k=12 7.49%
k=13 4.44%
k=14 2.3%
k=15 1.06%
k=16 0.4%
k=17 0.15%
k=18 0.04%
k=19 0.01%
k=20 0.003%
k=21 0.0006%
....
k=30 2.05*10^-14%
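
(The table above can be reproduced with the binomial PMF; a minimal sketch:)

```python
# Sketch reproducing the table above: binomial PMF with n=30, p=0.3.
from scipy.stats import binom

n, p = 30, 0.3
for k in range(n + 1):
    print(f"k={k:<2} {binom.pmf(k, n, p) * 100:.3g}%")
```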

The OP wants to test whether someone dropping into the same spot makes a difference. However, when you only do 30 drops, you don't know whether the 2nd person may have just hit the rare event of k=2 or whether it really makes a difference.

That's why, for what the OP wants to find out, a larger sample size is required.
 
So what happens if there is no randomness?
 
Let's say leeloo does a solo run of 100 drops and gets a higher-than-average claim rate, say 40%. Then she joins your testing, "normalisation" happens, and leeloo gets a lower-than-average claim rate.

What you call "normalization" or regression toward the mean is already wrapped into the sampling statistics, as Kosh mentioned. Also keep in mind that this has been replicated between multiple treatments in the round 1 and 2 testing in terms of looking at how those 60 drops per run vary across a total of about 500 drops now. The problem you're running into is another thing we often teach intro statistics students not to do, the gambler's fallacy: the idea that if you had a bad streak while random sampling, you're due for a good streak. Previous samples don't affect the results of future ones in these types of situations, and independence of observations is a key concept in statistics.

As for sample size, you can run into problems with rare events if you have n<10 pretty easily, which is why that's a general threshold researchers are told to avoid, but hit rate isn't prone to extreme outliers like TT returns are (globals, etc.). If the distribution weren't appropriate, it would show up in the standard tests checking assumptions before doing things like an ANOVA or regression. In all the testing so far, the data looks pretty consistent, so if you have data that satisfies the heavy burden needed for your claims, it's best to present it. What's been done here is pretty standard for statistical analysis, though, so the claims I've seen so far about sample size wouldn't carry very far in peer review.
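
(A toy illustration of the independence point, on simulated data with an assumed p = 0.3, not the actual drop logs: under independent sampling, a prior miss streak doesn't shift the next drop's hit chance.)

```python
# Sketch: with independent drops, the hit rate right after a 3-miss streak
# matches the overall hit rate. Simulated data, assumed p = 0.3.
import random

random.seed(1)
p, n = 0.3, 1_000_000
drops = [random.random() < p for _ in range(n)]

# drops whose previous three drops were all misses
after_streak = [drops[i] for i in range(3, n) if not any(drops[i-3:i])]
print(sum(drops) / n)                         # ~0.300 overall
print(sum(after_streak) / len(after_streak))  # ~0.300 after a miss streak too
```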
 

So you did more testing than only 30 drops; that's good.
Issue #1: the gambler's fallacy only applies to independent events, but you don't know what the game's algorithm is like and whether or not the events are independent.
When tossing a coin, it may not be possible to dynamically adjust the chances (or bias) based on previous results, but that is possible in video games.

Issue #2: you want to figure out whether or not the system is "biased". Checking whether someone dropping right after you makes a difference compared to dropping where nobody has dropped is basically a check for bias.

You're doing a check for bias in a system where you don't know whether or not the events are independent; under such circumstances a sample size of 30 is simply too small.
 

Over the different rounds of testing done, there was no indication the assumption of independence was violated, which is a standard check for any analysis like this. Sample size doesn't address that question at all, though; that's usually more a matter of replication. There's been enough replication here for what was being tested, and even if pseudoreplication were a problem (e.g., lack of independence), that's generally more a problem of false positives (no significant differences were found anyway), not false negatives. As Kosh mentioned, the data would already be addressing your concerns. This is how normal research is done in systems where you control what you can but have checks in place for the variability you can't control.
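
(For anyone curious what such an independence check can look like, one standard option is a Wald-Wolfowitz runs test; the sketch below runs it on a simulated hit/miss sequence, not the thread's real data.)

```python
# Sketch: Wald-Wolfowitz runs test for independence of a hit/miss sequence.
# The sequence is simulated (assumed p = 0.3) as a stand-in for a drop log.
import math, random

random.seed(2)
seq = [random.random() < 0.3 for _ in range(500)]

n1 = sum(seq)                 # hits
n2 = len(seq) - n1            # misses
runs = 1 + sum(seq[i] != seq[i - 1] for i in range(1, len(seq)))

mu = 2 * n1 * n2 / (n1 + n2) + 1              # expected number of runs
var = (mu - 1) * (mu - 2) / (n1 + n2 - 1)     # its variance
z = (runs - mu) / math.sqrt(var)
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
print(f"runs={runs}, z={z:.2f}, p={p_value:.3f}")  # large p: no dependence signal
```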

If you have testable data saying otherwise, please provide it. That's ultimately the point of these threads: to focus on data, avoiding the armchair theorizing and what-ifs that commonly crop up around someone's pet theory, or arguments about massive sample sizes that aren't expected even in formal scientific research.

If this kind of thing interests you, experimental design courses in statistics tailored to researchers usually cover it well. A lot of the things you brought up here are commonly misunderstood by students, like I've been seeing here, so that might help you out more than trying to learn it all in forums.
 
Hmmm, this says it all... you did more than 30 drops to confirm your theory and rule out other causes, didn't you?

It's sounding like you didn't read the original posts and subsequent replies, so it would probably be best to do that.
 
but you don't know what the game's algorithm is like and whether or not the events are independent.

What you can do, though, is analyze the two possible coordinates separately: hitrate and TT return. I kept accurate long logs for, iirc, a year and a half (2016 & 2017), about 110k turnover, mostly planetside, mostly lvl 3 amped, anywhere from single to triple drops. Per 100 drops, hitrate had an absolute minimum of about 25% and an absolute maximum of about 45%, but the overwhelming majority was pretty much spot on 32-33%. From my perspective, even though I am not even an amateur in statistics (almost ignorant would be more accurate), it would follow that the "normalization" happens on TT return, not on hitrate.
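
(Side note: that spread is roughly what independent drops at a flat 33% would produce per 100-drop block; a quick sketch, with the 33% taken from the post above.)

```python
# Sketch: spread of hits per 100-drop block if each drop is an independent
# 33% chance (rate assumed from the post above; purely illustrative).
import numpy as np

rng = np.random.default_rng(3)
blocks = rng.binomial(100, 0.33, size=1_000)   # 1,000 blocks of 100 drops
print(blocks.min(), blocks.max())              # roughly 18-48 hits at the extremes
print(np.percentile(blocks, [2.5, 50, 97.5]))  # ~[24, 33, 42]
```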

My thinking is that the process of mining has two possible outcomes: find or not. That should, in theory, result in a 50% average. Since the average is (or was) 33%, there are two possible explanations: either there are three possible outcomes (find with loot, find with no loot, no find), or the hitrate's average is forced, because

the TT per find has an impressive range of about 20 possible results (iirc), from 0.3x to 1000x.

As such, to me it appears obvious that the hitrate is distributed so as to "randomize" very small runs planetside or small runs indoors, but what truly randomizes the outcome is the TT.

Or, to put it otherwise: from the perspective of finding a claim, two tries are independent events (as long as they don't overlap within a certain timeframe), but from the perspective of how much that claim is worth, the two tries are not independent events. And if we move to what exactly that TT is made of, then almost nothing is independent ingame, with varying weights of interdependency (lower for lyst, higher for chalmon, etc.).

With apologies if I dropped like a fly in milk in a very technical discussion :D
 
My thinking is that the process of mining has two possible outcomes: find or not. That should, in theory, result in a 50% average. Since the average is (or was) 33%, there are two possible explanations: either there are three possible outcomes (find with loot, find with no loot, no find), or the hitrate's average is forced, because

Just a little mini stats lesson, but for two possible events, that 50% only occurs when there are equal chances of either event, with the special case (though common example) being the flip of a balanced coin. In most real-life examples, those "coins" are weighted towards one result or the other, giving you, say, a 25% chance of a "heads" or event occurring. Otherwise, you'd have a 50% chance of dying each day you are alive because there are two options (alive or dead). Obviously that rate can change over time, but the average chance of death is much lower, at <1% each day.

I didn't get into TT returns a whole lot in this analysis because of how tricky it formally is, but TT isn't really split up into 20 categories either. It's what is called a continuous variable, where the TT value of a claim can range anywhere from ~0.05 to >500,000. Once the TT value is calculated, the deposit size of small, ample, etc. is just descriptive.

I've had 100%+ TT since last fall, and my net gains have been steadily increasing rather than flat-lining or reverting back to ~90-95% TT after globals, etc. It would take some much heavier lifting to have a formal experimental design to test those differences, more so than the peer-reviewed journal article looking at Entropia mining posted in these threads a while back, but it does tentatively look like not everyone is locked into the same TT for mining with variation around, say, 90% TT. Some may average out to 90% if they do X, Y, and Z, with regression back to that mean over time as you allude to, but others could have closer to 100% as their true mean if they do different things. Like revealing locations with high MU, though, that's something where I wouldn't reveal exactly what I do, but let others figure out what works best for them.
 
I just want to add to this... probably common knowledge here, but Lyst and Oil are absolutely "filler" materials.

I agree with this, and not only for lyst and oil, but also for belk, blaus, melchi and a few others :)

Why do you think redulite is always hiding between belk, or vesperdite between lyst/zinc? :scratch2: (Caly only)

But for some ores it's even more complicated, because:

1 - Some are capped per number of drops: I'm only getting 1 claim every 50 drops; some even need 100 or more drops.

2 - Knowing this, I start testing when I get them again after x hours, meaning doing the exact same drops again after 1, 2, 3, ... or more hours.

3 - After learning the exact timing for respawns, I start amp testing to see at what amp level they are capped.

Fun, isn't it? :yay:

Sorry but no spoilers here for +300% resources :ahh:
 
I didn't get into TT returns a whole lot in this analysis because of how tricky it formally is

That was the point I was trying to make, but I lack the vocabulary. The output of mining can be expressed or measured in two ways: hitrate and TT value. Of the two, hitrate is the obvious one, the events are *there*, while for TT value we can only guess.

So if we have an outcome involving two variables, of which one is almost clear and visible while the other is obscure, it is safer to measure only the clear one. If we discover it to be static (or close to static), then it is safe to assume that the perceived volatility of the whole outcome is due to the obscure variable, which we can't control. But what we *can* control is the hitrate (or, more precisely, the droprate). Hence the usefulness of this whole thread, which Alukat was partially denying because "we don't know the algorithm".

It is somewhat about the whole philosophy of this section of the forum, which so very often chases the Grail in TT terms, while actual success comes down to mannerism, habit and obvious things.

Thank you for the lesson in stats, that was the direction I was aiming at (either 3 outcomes, or "forced"). For the rest, I didn't allude to anything; I am over 105% for what I measured. I don't believe in "9x%" as a sort of hard-coded limitation, but as a trend given some conditions. I might be totally off, but as a fellow miner you can surely understand that between dropping a probe and digging the claim, per day of playing, there's plenty of time left for wondering about all kinds of possibilities :laugh:
 
For the rest, I didn't allude to anything; I am over 105% for what I measured. I don't believe in "9x%" as a sort of hard-coded limitation, but as a trend given some conditions.

Just in case there was confusion: when I said allude, I meant that you touched on the idea that hit or TT rates will trend to a certain value, and I was just filling in that where that trend goes depends on conditions, as you've now said here.

For hit rate, it definitely does not appear to be static overall, but in a given time and location, HR will sit at a certain average for an unknown amount of time. I have some other data I'm still sussing through, but I do have some areas where I can say the mean HR was 20% for one area and about 40% for another tested during the same day, with a statistically significant difference between the two.
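
(A sketch of the kind of two-area comparison described here, using Fisher's exact test; the hit counts below are hypothetical stand-ins, not the actual data.)

```python
# Sketch: Fisher's exact test comparing hit rates in two areas.
# The counts are made-up placeholders (e.g., 60 drops per area).
from scipy.stats import fisher_exact

hits_a, drops_a = 12, 60     # ~20% HR area (hypothetical counts)
hits_b, drops_b = 24, 60     # ~40% HR area (hypothetical counts)

table = [[hits_a, drops_a - hits_a],
         [hits_b, drops_b - hits_b]]
odds_ratio, p_value = fisher_exact(table)
print(f"p = {p_value:.4f}")  # a small p indicates a significant HR difference
```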

TT is indeed tricky, but I'm looking at some ways it can be tackled. I already did some finder decay testing showing finder decay doesn't affect TT, but in that case I had to remove outliers (unamped, anything above size 6). If I want to model the full distribution of average TT size, though, and get better accuracy, I need a higher sample size than I do for HR to make sure the distribution used in the model fits correctly, as it's definitely not a normal bell curve. Still a work in progress.
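
(For the distribution-fitting step, a minimal sketch of one possible approach; the lognormal shape and the simulated TT values are assumptions for illustration, not the model actually used.)

```python
# Sketch: fit a right-skewed candidate distribution (lognormal, assumed)
# to simulated claim TT values, then check the fit.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
tt = rng.lognormal(mean=0.5, sigma=0.8, size=500)   # stand-in claim TT data

shape, loc, scale = stats.lognorm.fit(tt, floc=0)   # location pinned at 0
ks = stats.kstest(tt, 'lognorm', args=(shape, loc, scale))
print(shape, scale, ks.pvalue)                      # high p: fit not rejected
```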

I'm definitely glad folks can get something out of these threads. There's a lot that can be done without needing 100k drops to make any inferences.
 
Depends what exactly you're hoping to understand. For ease of representation, I see HR and TT outcome as interdependent values, with a conditioning between them similar to communicating vessels. Therefore, if you found a way to have a perpetual (let's say) 42% HR, then the TT would also adjust to a perpetual ±2.33 PED per claim (assuming unamped ores). That's why I prefer the chaos of having both variable, especially since high-TT materials (higher than 0.1 TT per piece or so) bring their own branch of complications. As for actual causality, no matter what domain I tried, I was able to corroborate success with a few things that are rather philosophical in expression: persistence, consistency, a certain common sense (e.g. don't drop X probes in one place) and of course the almighty .xls for tracking. Gl & hf
 