In parasitology, the quantitative study of parasitism in a host population involves the use of statistics to draw meaningful conclusions from observations of the prevalence and intensity of parasitic infection.
Quantifying parasites in a sample of hosts or comparing measures of infection across two or more samples can be challenging.
The parasitic infection of a sample of hosts inherently exhibits a complex pattern that cannot be adequately quantified by a single statistical measure. Because two or more separate indices are advisable, comparing infections across different samples of hosts likewise requires two or more separate statistical tests.
Some of the available statistical measures have markedly different biological interpretations, while others have more-or-less overlapping interpretations or no interpretation at all. Therefore, one should apply measures that have clear and separate biological interpretations and thus do not predict each other.
Parasite individuals typically exhibit an aggregated (right-skewed) distribution among host individuals: most hosts harbour few if any parasites, and a few hosts harbour many of them. This quantitative feature of parasitism violates the distributional assumptions of many traditional statistical methods, making them inappropriate and requiring the use of more advanced, computationally intensive methods.
Always give the host sample size. In most cases, this is expressed as the number of host individuals examined. (Exceptionally, other units may be used in special cases.)
Describe prevalence. This is the proportion of infected hosts among all the hosts examined. Give the confidence interval (CI) of prevalence (either a Clopper-Pearson interval or an adjusted Wald or Sterne's interval) to indicate the accuracy of the estimate (a 95% confidence level is advisable).
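As a minimal sketch of this step, the following Python code computes prevalence and a Clopper-Pearson interval using only the standard library. The parasite counts are hypothetical illustration data, and the helper names (binom_cdf, clopper_pearson) are assumptions of this sketch rather than functions of any established package. The interval bounds are found by bisection on the binomial tail probabilities.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def _root_increasing(g, iters=200):
    """Bisection on [0, 1] for a function g that increases through zero."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(infected, n, conf=0.95):
    """Exact (Clopper-Pearson) CI for a binomial proportion."""
    alpha = 1 - conf
    # Lower bound: the p at which P(X >= infected) equals alpha / 2.
    lower = 0.0 if infected == 0 else _root_increasing(
        lambda p: (1 - binom_cdf(infected - 1, n, p)) - alpha / 2)
    # Upper bound: the p at which P(X <= infected) equals alpha / 2.
    upper = 1.0 if infected == n else _root_increasing(
        lambda p: alpha / 2 - binom_cdf(infected, n, p))
    return lower, upper

# Hypothetical parasite counts, one value per host examined.
counts = [0, 0, 3, 0, 1, 0, 0, 12, 0, 2, 0, 0, 1, 0, 45,
          0, 0, 4, 0, 1, 0, 7, 0, 0, 2, 0, 0, 1, 0, 18]

n_hosts = len(counts)
n_infected = sum(c > 0 for c in counts)
prevalence = n_infected / n_hosts
prev_low, prev_high = clopper_pearson(n_infected, n_hosts)

print(f"n = {n_hosts}, prevalence = {prevalence:.2f}, "
      f"95% CI = ({prev_low:.3f}, {prev_high:.3f})")
```

The bisection approach avoids any dependency on a beta-distribution quantile function; a statistics library that provides one would give the same bounds more directly.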
Describe mean intensity. This is the mean number of parasites found in the infected hosts (the zeros of uninfected hosts are excluded). Given the sample size and prevalence, mean intensity determines the total quantity of parasites found in the sample of hosts. Because parasites typically show an aggregated (right-skewed) distribution, its value depends heavily on a few extremely infected hosts. Also give a CI to indicate the accuracy of the estimate. Use a bias-corrected and accelerated (BCa) bootstrap to obtain this confidence interval.
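The BCa bootstrap for mean intensity can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not a reference one: the counts are hypothetical, the function name bca_ci, the number of resamples (2000) and the fixed seed are choices made for this example, and the acceleration constant is estimated with the usual jackknife formula.

```python
import random
from statistics import NormalDist, mean

def bca_ci(data, stat=mean, conf=0.95, n_boot=2000, seed=0):
    """Bias-corrected and accelerated (BCa) bootstrap CI for stat(data)."""
    data = list(data)
    n = len(data)
    rng = random.Random(seed)          # fixed seed: reproducible sketch
    nd = NormalDist()
    theta_hat = stat(data)

    # Bootstrap distribution: resample hosts with replacement.
    boots = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))

    # Bias correction z0 from the share of replicates below the estimate
    # (clamped so that the normal quantile stays finite).
    prop = sum(b < theta_hat for b in boots) / n_boot
    prop = min(max(prop, 1 / (n_boot + 1)), n_boot / (n_boot + 1))
    z0 = nd.inv_cdf(prop)

    # Acceleration a from leave-one-out (jackknife) estimates.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    jmean = mean(jack)
    num = sum((jmean - j) ** 3 for j in jack)
    den = 6 * sum((jmean - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def adjusted(q):
        """Map a nominal quantile level to its BCa-adjusted level."""
        z = nd.inv_cdf(q)
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    alpha = 1 - conf
    idx = lambda q: min(n_boot - 1, max(0, int(adjusted(q) * n_boot)))
    return boots[idx(alpha / 2)], boots[idx(1 - alpha / 2)]

# Hypothetical parasite counts, one value per host examined.
counts = [0, 0, 3, 0, 1, 0, 0, 12, 0, 2, 0, 0, 1, 0, 45,
          0, 0, 4, 0, 1, 0, 7, 0, 0, 2, 0, 0, 1, 0, 18]

# Intensity is defined on infected hosts only, so zeros are dropped.
intensities = [c for c in counts if c > 0]
mean_intensity = mean(intensities)
ci_low, ci_high = bca_ci(intensities)

print(f"mean intensity = {mean_intensity:.2f}, "
      f"95% BCa CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With right-skewed counts such as these, the resulting interval is typically asymmetric around the mean, which is the behaviour the BCa adjustment is meant to capture.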
Describe median intensity. This is the median number of parasites found in the infected hosts (the zeros of uninfected hosts are excluded). Median intensity shows a typical level of infection among the infected hosts. Use an exact CI to indicate the accuracy of the estimate.
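One common way to build an exact CI for a median uses order statistics and the Binomial(n, 1/2) distribution; the sketch below assumes that construction, which the text above does not specify. The counts are hypothetical and the function name exact_median_ci is an assumption of this example. Because the interval endpoints must be observed values, the achieved coverage is at least, and usually somewhat above, the nominal level.

```python
from math import comb
from statistics import median

def exact_median_ci(values, conf=0.95):
    """Distribution-free CI for the median based on order statistics.

    Returns (lower, upper, achieved_coverage).
    """
    x = sorted(values)
    n = len(x)
    alpha = 1 - conf
    tail = 0.0   # P(B <= d - 1) for B ~ Binomial(n, 1/2)
    d = 0        # rank of the lower endpoint (1-based); 0 means none found
    for k in range(n // 2 + 1):
        tail_next = tail + comb(n, k) / 2 ** n
        if tail_next > alpha / 2:
            break
        tail = tail_next
        d = k + 1
    if d == 0:
        raise ValueError("sample too small for the requested confidence level")
    # Interval runs from the d-th smallest to the d-th largest value.
    return x[d - 1], x[n - d], 1 - 2 * tail

# Hypothetical parasite counts, one value per host examined.
counts = [0, 0, 3, 0, 1, 0, 0, 12, 0, 2, 0, 0, 1, 0, 45,
          0, 0, 4, 0, 1, 0, 7, 0, 0, 2, 0, 0, 1, 0, 18]

# Intensity is defined on infected hosts only, so zeros are dropped.
intensities = [c for c in counts if c > 0]
median_intensity = median(intensities)
med_low, med_high, coverage = exact_median_ci(intensities)

print(median_intensity)                 # 2.5
print(med_low, med_high)                # 1 12
print(f"achieved coverage = {coverage:.3f}")
```

With only 12 infected hosts in this example, the interval spans the 3rd smallest to the 3rd largest intensity; samples with five or fewer infected hosts cannot support a 95% interval of this kind at all, which the function signals with an error.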