Spatial and ecological modelling of cancer risk: statistical tools for analysing gastrointestinal tract cancer sites in the Caspian region of Iran

Mohammadreza Mohebbi, 2017

High incidence rates of gastrointestinal tract cancers have been reported in the Caspian region of Iran. The aims of this thesis were to describe the geographical patterns of gastrointestinal tract cancer incidence based on cancer registry data; to identify significant “hot spots” or clusters of cancer; to investigate the association between these cancers and the region’s dietary patterns and socioeconomic factors; and to map cancer incidence ratios after both adjusting for those risk factors and removing random and geographic variation from area-specific age-standardised incidence ratios (SIRs).

This thesis analyses the geographic distribution of gastrointestinal tract cancers in the Caspian region of Iran during 2001–2005. The Babol Cancer Registry, which covers the two major northern Iranian provinces of Mazandaran and Golestan (total population = 4,484,622), was used to identify new gastrointestinal tract cancer cases. Age-specific cancer incidence rates were calculated for 7 gastrointestinal tract cancer sites in 26 wards of Mazandaran and Golestan provinces.

The main epidemiological findings of the studies carried out in this thesis were as follows. Clusters of high incidence were identified for esophageal, stomach, colorectal and liver cancer in both sexes, as well as a possible cluster of pancreatic cancer in males. Evidence of systematic clustering was found for esophageal and stomach cancers in men, in women and in both sexes combined. An ecological analysis was incorporated into a hierarchical disease mapping study in order to estimate cancer SIRs and explore regional-level risk factors. To avoid multicollinearity among risk factors and to reduce the dimensionality of the risk indicators, factor analysis was applied.
A generalised linear mixed model with a Poisson distribution and a spatially autocorrelated random effect, together with ecological regression methods, was used to estimate agglomeration-specific cancer SIRs, investigate spatial variations, and explore and quantify the associations between regional characteristics and cancer incidence. Esophageal and stomach cancers were associated with aggregated risk factors, including income, urbanisation, and dietary patterns. Esophageal and stomach cancer SIRs were lower in urban areas and in areas of high income. Esophageal cancer SIRs were lower in areas with higher proportions of people having unrestricted food choice and higher in areas with higher proportions of people with restricted food choice. Furthermore, regional characteristics explored as potential risk factors for cancer incidence can provide a better understanding of regional variations in cancer occurrence rates, which can help in planning future public health prevention programs.

The methodological objectives of this thesis were to produce evidence-based maps of cancer incidence by means of spatial statistical modelling; to evaluate and advance the application of methodology in the analysis of spatially correlated count data; and to compare and select the most suitable approach to the analysis of cancer incidence for one particular geographic region in order to establish underlying patterns of cancer risk over space and environmental factors. Three approaches were employed: 1) classical geostatistical methods based on variograms, global indices of spatial autocorrelation and scanning of local rates to detect local clusters of cancer incidence; 2) Poisson regression in the context of generalised linear mixed models adjusted for geographically autocorrelated count data; and 3) count models in a Bayesian hierarchical context using Markov chain Monte Carlo (MCMC) methods.
All model-based analyses used hierarchical spatial models for areal count data based on distance-based or neighbourhood-based autoregressive structures, which were specifically adapted for the analysis of cancer SIRs in the Caspian region of Iran. These models allowed an in-depth examination of the spatial structure in the cancer SIR data: they made it possible to model different distributions for disease counts in order to deal with overdispersion, to represent different forms of spatial heterogeneity through the choice of spatial prior, and to combine spatial and independent area random effects. The models controlled efficiently for the effects of sparse data and differing sample sizes by shrinking less reliable estimates towards a local mean. Moreover, the easy inclusion of covariate effects in the models provided more reliable estimates of covariate effects and accurate confidence intervals, and allowed the influence of groups of factors on observed spatial patterns to be investigated.
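
As an illustration of the kind of hierarchical model described above, one widely used neighbourhood-based formulation is the Besag-York-Mollié convolution model. The abstract does not state which specification the thesis adopted, so the sketch below is an assumption, and all of its notation is introduced here: $O_i$ and $E_i$ are the observed and age-standardised expected counts in area $i$, $\theta_i$ is the underlying relative risk, $\mathbf{x}_i$ is a vector of area-level covariates (for example, factor scores for income, urbanisation and dietary pattern), $u_i$ is a spatially structured random effect, $v_i$ is an independent area random effect, $\partial i$ is the set of neighbours of area $i$, and $n_i$ is the number of those neighbours.

```latex
\begin{align*}
O_i \mid \theta_i &\sim \mathrm{Poisson}(E_i\,\theta_i), \qquad \mathrm{SIR}_i = O_i / E_i,\\
\log \theta_i &= \alpha + \mathbf{x}_i^{\top}\boldsymbol{\beta} + u_i + v_i,\\
u_i \mid u_{j \neq i} &\sim \mathcal{N}\!\left(\frac{1}{n_i}\sum_{j \in \partial i} u_j,\; \frac{\sigma_u^2}{n_i}\right),\\
v_i &\sim \mathcal{N}\!\left(0,\; \sigma_v^2\right).
\end{align*}
```

Under a specification of this form, the raw ratio $O_i / E_i$ is replaced by a model-based estimate of $\theta_i$. The conditional prior on $u_i$ is centred on the average of the neighbouring effects, so areas with small expected counts are pulled more strongly towards their local mean, while $v_i$ absorbs unstructured overdispersion.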