We study the problem of testing the goodness of fit of the data to the uniform distribution over many categories under a minimax setting in which the class of alternatives is obtained by the removal of an Lp ball of a radius r around the uniform rate sequence. We provide an expression describing the sharp asymptotic of the minimax risk in terms of these parameters as the number of categories goes to infinity.
Our result settles an open question related to an extensive line of work over the last two decades. Furthermore, it allows the comparison of the many estimators previously proposed for this problem at the constant level, rather than at the rate of convergence of the risk or the scaling order of the sample complexity.
The minimax test mostly relies on collisions in the very small sample limit but behaves like the chisquared test for moderate and large sample sizes. Empirical studies over a range of problem parameters show that the asymptotic estimate of the minimax risk is accurate in finite samples and that the asymptotic minimax test is significantly better than the chisquared test or a test that only uses collisions.
Alon Kipnis is a senior lecturer at the Schools of Computer Science at Reichman University. He received the Ph.D. in electrical engineering from Stanford University in 2017. Between 2017-2021 he was a postdoctoral research scholar and a lecturer at the Department of Statistics at Stanford, advised by David Donoho. Dr. Kipnis' research combines mathematical statistics, information theory, and ambitious data science.