Analysis of left-censored data with zeros


Carl Gogolak, Environmental Measurements Laboratory, US Department of Homeland Security, 201 Varick Street, Fifth Floor, New York, NY 10014
The concentration of a contaminant measured in a particular medium might be distributed as a positive random variable when the contaminant is present, but the contaminant may not always be present. Suppose that in the underlying population the value zero occurs with probability delta, and that the conditional distribution, given that the value is nonzero, is that of a positive random variable. If there is a level below which the analytical apparatus cannot distinguish the concentration from zero, a sample from such a population will be censored on the left. The presence of both zeros and positive values in the censored portion of such samples complicates the problem of estimating both the parameters of the underlying positive random variable and the probability of a zero observation. Using the method of maximum likelihood, it is shown that the solution to this estimation problem reduces largely to that of estimating the parameters of the distribution truncated at the censoring point. The maximum likelihood estimate of the proportion of zero values then follows directly. Simulation studies were performed to examine the small-sample behavior of the estimates and to compare them with previously suggested methods for handling such data. The estimation method was used to fit several different distributions to a set of severely censored experimental monitoring data.
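The reduction to a truncated fit can be illustrated with a small sketch. It rests on assumptions not stated in the abstract: a single known censoring level c, with every value below c (zero or positive) reported only as "below c". Under that assumption the probability of an uncensored observation is p = (1 - delta)(1 - F(c)), where F is the distribution function of the positive variable. The likelihood then factors into a binomial term in p and the likelihood of the uncensored values under the distribution truncated at c. Maximizing the two factors separately gives p_hat = m/n (m uncensored values out of n) and delta_hat = 1 - p_hat / (1 - F(c)), with F evaluated at the truncated-fit parameters. The exponential distribution below is chosen purely for illustration, because its truncated fit has a closed form; the function names are hypothetical, and this is not necessarily the author's procedure.

```python
import math
import random


def fit_zero_inflated_censored_exponential(uncensored, n_total, c):
    """Illustrative estimator (assumed exponential model, known censoring level c).

    uncensored: observed values at or above c
    n_total:    total sample size, including censored observations
    Returns (rate_hat, delta_hat).
    """
    m = len(uncensored)
    if m == 0 or m > n_total:
        raise ValueError("need 1 <= len(uncensored) <= n_total")
    if any(x < c for x in uncensored):
        raise ValueError("uncensored values must be at or above c")

    # Truncated fit: for an exponential truncated on the left at c,
    # x - c is again exponential, so the rate MLE is m / sum(x - c).
    excess = sum(x - c for x in uncensored)
    if excess <= 0:
        raise ValueError("uncensored values must not all equal c")
    rate_hat = m / excess

    # Binomial part: p_hat = m / n estimates (1 - delta) * (1 - F(c)).
    p_hat = m / n_total
    survival_at_c = math.exp(-rate_hat * c)  # 1 - F(c) under the fitted model

    # Solve for delta. A negative value means the unconstrained solution
    # falls outside [0, 1]; it is clipped to 0 here as a simplification
    # (a full treatment would refit the rate with delta fixed at 0).
    delta_hat = max(0.0, 1.0 - p_hat / survival_at_c)
    return rate_hat, delta_hat


def simulate_uncensored(n, delta, rate, c, rng):
    """Draw n values (zero with probability delta, else exponential)
    and return only those at or above the censoring level c."""
    values = [0.0 if rng.random() < delta else rng.expovariate(rate)
              for _ in range(n)]
    return [x for x in values if x >= c]
```

For example, calling `simulate_uncensored(20000, 0.3, 2.0, 0.25, random.Random(0))` and passing the result to `fit_zero_inflated_censored_exponential` with `n_total=20000` and `c=0.25` should return estimates close to the rate 2.0 and the zero proportion 0.3 used in the simulation. For other distributions the truncated fit generally has no closed form and would need a numerical optimizer.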