Maps of disease rates produced without careful consideration of the underlying population distribution may be unreliable because of the well-known small numbers problem. Smoothing methods such as kernel density estimation (KDE) are used to control the population basis of the spatial support over which each disease rate is calculated. The degree of smoothing is controlled by a user-defined parameter (bandwidth or threshold), which influences both the resolution of the disease map and the reliability of the computed rates. Methods for selecting a smoothing parameter automatically, such as the normal scale, plug-in, and smoothed cross-validation bandwidth selectors, have been proposed for use with non-spatial data, but their relative utility for disease mapping remains unknown. This study assesses their relative performance, in terms of resolution and reliability, when applied to disease mapping.
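To make the role of the smoothing parameter concrete, the following is a minimal sketch and not the implementation used in this study. It assumes a Gaussian kernel, a diagonal (per-axis) bandwidth, and the textbook normal scale rule h_j = sigma_j * (4 / ((d + 2) n))^(1 / (d + 4)). The function names, the use of unweighted area centroids as input points, and the toy data are all hypothetical.

```python
import numpy as np


def normal_scale_bandwidth(points):
    """Per-axis normal scale bandwidths for a Gaussian kernel.

    Assumption: diagonal bandwidth, h_j = sigma_j * (4 / ((d + 2) * n)) ** (1 / (d + 4)).
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    sigma = points.std(axis=0, ddof=1)  # sample standard deviation per axis
    factor = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    return factor * sigma


def kernel_smoothed_rate(grid, locations, cases, population, bandwidth):
    """Kernel-weighted cases divided by kernel-weighted population at each grid point."""
    grid = np.asarray(grid, dtype=float)
    locations = np.asarray(locations, dtype=float)
    # Scaled coordinate differences, shape (n_grid, n_locations, d).
    diff = (grid[:, None, :] - locations[None, :, :]) / bandwidth
    weights = np.exp(-0.5 * np.sum(diff ** 2, axis=2))  # unnormalised Gaussian kernel
    numerator = weights @ np.asarray(cases, dtype=float)
    denominator = weights @ np.asarray(population, dtype=float)
    return numerator / denominator


# Hypothetical toy data: 200 area centroids with populations and case counts.
rng = np.random.default_rng(0)
centroids = rng.normal(size=(200, 2))
population = rng.integers(500, 5000, size=200)
cases = rng.binomial(population, 0.01)

h = normal_scale_bandwidth(centroids)
rates = kernel_smoothed_rate(centroids[:5], centroids, cases, population, h)
```

A larger bandwidth draws each rate from a larger population, which stabilises the estimate but blurs local variation. A smaller bandwidth does the opposite. Because the rule above scales with n^(-1/(d + 4)), subgroups with fewer observations receive wider bandwidths under this sketch.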
Using a simulated dataset of heart disease mortality among males aged 35 years and older in Texas, we assess methods for automatically selecting a smoothing parameter. Our results show that while all parameter choices accurately estimate the overall state rates, they differ in the degree of spatial resolution they provide. Further, parameter choices that yield desirable characteristics for one subgroup of the population (e.g., a specific age group) may not be appropriate for other groups.
We show that the appropriate threshold value depends on the characteristics of the data, and that bandwidth selector algorithms can be used to guide decisions about mapping parameters. An unguided choice may produce maps that distort the balance between resolution and statistical reliability.