Department of Statistics

Recent Submissions

  • Validation of Malaria Predictive Model with Missing Covariates
    (2016-09-29) Duah, D.
    This study develops a statistical methodology for validating a predictive clinical malaria model when the data contain missing values in the predictor variables. Within the logistic regression framework, multiple imputation techniques for the missing values and a penalized likelihood approach to avoid overfitting were adopted. Models with different functional forms were built using known predictors (age, sickle cell, blood group, parasite density, and mosquito bed net use) together with selected malaria antibody-specific antigens and the FCGR3B polymorphism. Models were assessed through visualization and through differences in the Area Under the Receiver Operating Characteristic curve (AUROC) and the Brier Score (BS), estimated by suitable internal cross-validation designs. The contributions of this research are threefold: (i) it addresses the statistical question of how to build and validate a risk prediction model in the presence of missing explanatory variables; (ii) it improves the general statistical approach for malaria epidemiology; and (iii) it identifies potential malaria antibodies and the FCGR3B polymorphism that should be the research focus in the search for potential malaria vaccine candidates.
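The two validation metrics named in the abstract can be computed directly; the following is a minimal pure-Python sketch (function names illustrative, not from the study) of the AUROC via its rank-sum formulation and of the Brier Score:

```python
def auroc(y_true, y_score):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) statistic:
    the fraction of positive/negative pairs ranked correctly, ties count 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)
```

In the study these quantities would be estimated inside an internal cross-validation design and combined across the multiply imputed data sets.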
  • On the Estimation of Conditional Tail Index and Extreme Quantiles under Random Censoring
    (2016-09-29) Minkah, R.
    In the area of statistics of extremes, the main assumption on any set of univariate data is to regard it as a complete sample of independent and identically distributed observations from an unknown distribution function F. However, in many real-life applications, such as survival analysis, observations are usually subject to random censoring and may be influenced by underlying covariate information. In such cases, classical extreme value theory needs some adjustment to take into account the presence of censoring and covariates. In this presentation, we propose estimators of the conditional tail index and conditional extreme quantiles for heavy-tailed distributions in the presence of random censoring and covariate information. We compare the proposed estimators with existing estimators from the literature in a large-scale simulation study. The results show an improvement in bias and median absolute deviation over the existing estimators of the conditional tail index and conditional extreme quantiles.
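For background, one well-known adjustment of the classical Hill tail-index estimator to random censoring divides it by the proportion of uncensored observations among the largest order statistics. The sketch below shows that standard adaptation only; it is not the new estimators proposed in this talk:

```python
import math

def hill(x, k):
    """Classical Hill estimator of the tail index from the k largest observations."""
    xs = sorted(x)
    x_k = xs[-k - 1]  # (k+1)-th largest order statistic
    return sum(math.log(t / x_k) for t in xs[-k:]) / k

def censored_hill(x, delta, k):
    """Hill estimator adjusted for random censoring: divide by the proportion
    of uncensored observations (delta == 1) among the k largest values."""
    pairs = sorted(zip(x, delta))
    p_hat = sum(d for _, d in pairs[-k:]) / k
    return hill(x, k) / p_hat
```

With no censoring (all delta equal to 1) the adjusted estimator reduces to the classical one.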
  • Maximum Penalized Likelihood Approach for Current Status Data with Informative Censoring Based on Frailties
    (2017-02-09) Faisal, A.; Mettle, F.O.
    Current status data arise in many fields, including epidemiological surveys and animal carcinogenicity experiments. By current status data we mean that each subject in a study is observed only once, at a certain time, to ascertain whether or not the event of interest has occurred. This means that for current status data the event (failure) time is not observed exactly but is only known to be smaller or greater than the observation (censoring) time. The standard assumption in survival data analysis is that the observation time is non-informative, that is, unrelated to the failure time. However, this assumption may not be valid in some circumstances, especially if the subject is lost to follow-up before the study ends. The examination time for such a subject can potentially be linked to the failure time, and the examination time is then said to be informatively censored. To avoid the severely misleading inferences that can result from failing to account for informative censoring, shared frailty models have been proposed in the literature for current status data. The most commonly used estimation procedure is the Expectation-Maximization (EM) algorithm, but this approach yields a discrete estimate of the distribution and possibly negative hazard estimates. Thus the EM algorithm may not only fail to provide an accurate trend of how the baseline hazard estimates change over time, but may also affect the accuracy of the estimated parameters. The goal of this thesis, therefore, is to develop Maximum Penalized Likelihood (MPL) methods for current status data with informative censoring based on a gamma shared-frailty assumption. We intend to show how the MPL procedure can be employed to simultaneously estimate the regression parameters and a smooth continuous baseline hazard function for the proportional hazards model, the semiparametric transformation model, and clustered data under the proportional hazards model. The variances and the asymptotic properties of these MPL estimators will be studied. Simulation studies will also be conducted to ascertain the performance of our proposed MPL methods, and comparisons will be made with some of the existing EM methods. For illustrative purposes, we will apply our methods to the rodent tumorigenicity experiment data set provided by the National Toxicology Program (1998).
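To make the current-status data structure concrete, the sketch below writes down the log-likelihood under a simple exponential event-time model and maximizes it by grid search. It illustrates only the likelihood structure described above, not the frailty terms or the roughness penalty of the MPL approach; all numbers are illustrative:

```python
import math

def cs_loglik(lam, times, delta):
    """Current-status log-likelihood under an exponential event-time model.
    delta_i = 1 if the event had occurred by inspection time c_i, so each
    subject contributes F(c_i)^delta_i * S(c_i)^(1-delta_i), F(c)=1-exp(-lam*c)."""
    ll = 0.0
    for c, d in zip(times, delta):
        F = 1.0 - math.exp(-lam * c)
        ll += d * math.log(F) + (1 - d) * math.log(1.0 - F)
    return ll

# Toy inspection times and current-status indicators (illustrative only).
times = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
delta = [0, 0, 1, 1, 1, 1]

# Maximize the log-likelihood over a grid of candidate rates.
grid = [i / 100 for i in range(5, 300)]
lam_hat = max(grid, key=lambda lam: cs_loglik(lam, times, delta))
```

The event time itself never enters the likelihood; only its position relative to the inspection time does, which is what makes current status data challenging.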
  • Implementation of Fourier Transform in Image Processing
    (2017-02-09) Ansah-Narh, T.
    For a realistic radio interferometer under the flat-sky approximation, the basic observation is a set of complex visibilities, related to the sky brightness through a two-dimensional Fourier Transform (FT) as described by the Radio Interferometer Measurement Equation (RIME). The Discrete Fourier Transform (DFT) is a specific form of Fourier analysis that is widely employed in signal processing and related fields to analyze the frequencies contained in a sampled signal, to solve partial differential equations, and to perform other operations such as convolutions. The presentation will focus on the implementation of the Fast Fourier Transform (FFT) algorithm, converting a 2D image to the frequency domain and back to the image domain (inverse FFT). The technique will then be related to the RIME.
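The forward/inverse round trip described above can be sketched in a few lines with NumPy's FFT routines (the array here is just a stand-in for an image):

```python
import numpy as np

# A small synthetic "image" (any 2-D real array will do).
img = np.arange(64, dtype=float).reshape(8, 8)

# Forward 2-D FFT: image domain -> frequency domain.
spectrum = np.fft.fft2(img)

# The spectrum is complex; its magnitude is what is usually visualised,
# typically after np.fft.fftshift to move the zero frequency to the centre.
magnitude = np.abs(np.fft.fftshift(spectrum))

# Inverse 2-D FFT: frequency domain -> image domain (lossless round trip).
restored = np.fft.ifft2(spectrum).real
```

The shifted zero-frequency bin holds the sum of all pixel values, the image's "DC" component.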
  • Bayesian and Multilevel Approaches to Modelling Road Traffic Fatalities in Ghana
    (2016-07-04) Hesse, C.A.; Iddi, S.
    Knowledge of accident rates in a country provides a useful tool for a comprehensive analysis of their causes as well as for their prevention. In 1949, Prof. R. J. Smeed gave a regression model for estimating road traffic fatalities. This study gives a modified form of Smeed's regression formula for estimating road traffic fatalities in Ghana; the modified model was found to be more accurate for Ghana than Smeed's original formula. A Bayesian model for predicting the annual regional distribution of the number of road traffic fatalities in Ghana is derived, using road traffic accident statistics from the National Road Safety Commission, the Ghana Statistical Service, and the Driver and Vehicle Licensing Authority. The data span 1991 to 2009. Since the parameters are assumed to vary across the regions, they are considered random variables with probability distributions. Markov chain Monte Carlo (MCMC) sampling techniques were used to draw samples from each posterior distribution, thereby determining the values of the unknown parameters for each region from the given data. The study has shown that population and the number of registered vehicles are the predominant factors affecting road traffic fatalities in Ghana. The effect of additional factors, such as human factors (the driver, passenger and pedestrian), the vehicle (its condition and maintenance), the environment/weather, and the nature of the road, cannot be ruled out. Similar to the Bayesian model, where the regression parameters are considered random variables, a Multilevel Random Coefficient (MRC) model for predicting road traffic fatalities in Ghana was also developed. In this model, the number of road traffic fatalities and the regional groups are conceptualized as a hierarchical system of road traffic fatalities nested within the geographical regions of Ghana, with fatalities and regions defined at separate levels of this hierarchy. Instead of estimating a separate regression equation for each of the 10 regions of Ghana, a multilevel regression analysis was applied to estimate the regression coefficients for each region from the given data.
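For reference, Smeed's 1949 formula is commonly stated as D = 0.0003(NP^2)^(1/3), where D is annual road deaths, N the number of registered vehicles, and P the population; the modified Ghanaian version developed in the study is not reproduced here. A minimal sketch with illustrative inputs:

```python
def smeed_deaths(vehicles, population):
    """Smeed's (1949) cross-country formula for annual road deaths:
    D = 0.0003 * (N * P**2) ** (1/3),
    where N is registered vehicles and P is population."""
    return 0.0003 * (vehicles * population ** 2) ** (1.0 / 3.0)

# Illustrative inputs only, not actual Ghanaian statistics.
d = smeed_deaths(1_000_000, 20_000_000)
```

The formula makes both population and vehicle fleet size drivers of fatalities, which is consistent with the study's finding that these are the predominant factors.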
  • On Statistics of Extremes under Random Censoring
    (2016-04-07) Minkah, R.; Asiedu, L.
    In the field of statistics of extremes, the most common assumption is that samples are independent and identically distributed, or weakly dependent and stationary, from a distribution function F. However, in many real-life applications, such as survival analysis and reliability data, observations are usually censored. For such data sets, the classical estimators in statistics of extremes need some modification to take censoring into account. Compared to classical statistics of extremes, the censored case is fairly new and in its early stages of development. In this presentation, we propose estimators of the extreme value index and of extreme quantiles when data are randomly censored. We compare the proposed estimators with an exhaustive list of estimators from the literature in a large simulation study. The results show that our estimators are more robust to censoring. In addition, they provide better confidence intervals when there is heavy censoring and more intermediate observations are included in the estimation.
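As background, the classical uncensored building blocks are the Hill estimator of the extreme value index and the Weissman-type extreme quantile estimator. The sketch below shows these standard tools only (not the censoring-robust estimators proposed in the talk):

```python
import math

def hill(x, k):
    """Hill estimator of the extreme value index from the k largest observations."""
    xs = sorted(x)
    return sum(math.log(xs[-j] / xs[-k - 1]) for j in range(1, k + 1)) / k

def weissman_quantile(x, k, p):
    """Weissman-type estimator of the quantile exceeded with small probability p,
    extrapolating from the (k+1)-th largest order statistic with the Hill estimate."""
    n = len(x)
    x_nk = sorted(x)[-k - 1]
    return x_nk * (k / (n * p)) ** hill(x, k)
```

Extrapolating to smaller exceedance probabilities p yields larger quantile estimates, as the positive tail index dictates.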
  • Comparison of Least Squares Method and Bayesian with Multivariate Normal Prior in Estimating Multiple Regression Parameters
    (2016-02-17) Mettle, F.O.; Iddi, S.
    Based on an assumption of multivariate normal priors for the parameters of a multiple regression model, this study outlines an algorithm for applying the traditional Bayesian method to estimate regression parameters. From a given data set, a Jackknife sample of least squares regression coefficient estimates is obtained and used to derive estimates of the mean vector and covariance matrix of the assumed multivariate normal prior distribution of the regression parameters. The aim was to determine whether Bayesian methods for regression parameter estimation present a stable and consistent improvement over classical regression modelling. The results indicate that the Bayesian method and the Least Squares Method (LSM) produced almost the same estimates of the regression parameters and coefficient of determination, with the Bayesian method having smaller standard errors.
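Under a known error variance, the posterior mean for regression coefficients with a multivariate normal prior has a closed form. The sketch below (illustrative names; the study instead derives the prior's mean and covariance from a Jackknife sample) checks that a very diffuse prior essentially reproduces the least squares estimate, mirroring the study's finding that the two methods give nearly identical estimates:

```python
import numpy as np

def bayes_regression(X, y, mu0, Sigma0, sigma2):
    """Posterior mean of regression coefficients under a multivariate normal
    prior N(mu0, Sigma0) and known error variance sigma2:
    (X'X/s2 + Sigma0^-1)^-1 (X'y/s2 + Sigma0^-1 mu0)."""
    prec0 = np.linalg.inv(Sigma0)
    A = X.T @ X / sigma2 + prec0
    b = X.T @ y / sigma2 + prec0 @ mu0
    return np.linalg.solve(A, b)

# Synthetic data: intercept 1.0, slope 2.0, small noise.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.1 * rng.normal(size=50)

ols = np.linalg.lstsq(X, y, rcond=None)[0]
bayes = bayes_regression(X, y, np.zeros(2), 1e6 * np.eye(2), 0.1 ** 2)
```

An informative prior (smaller Sigma0) would shrink the estimate toward mu0 and reduce its posterior standard errors, which is where the Bayesian advantage reported in the study comes from.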
  • Coloured Random Graphs, Simple Random Allocation of Coloured Balls into Coloured Bins
    (2016-03-17) Doku-Amponsah, K.
    The simple coloured random allocation model is obtained when coloured balls are sequentially inserted at random into coloured bins. The empirical occupancy measure of this model counts the number of bins of a given colour containing a given number of balls of each colour. In this paper we show that, given the colours, both the empirical occupancy measure of simple coloured random allocation models and the empirical neighbourhood distribution of sparse coloured random graph models converge weakly to the product of independent Poisson distributions.
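A minimal simulation sketch of the allocation model (names illustrative): balls are dropped uniformly at random into bins, and the empirical occupancy measure records, for each bin colour, how many bins carry each profile of per-colour ball counts:

```python
import random
from collections import Counter

def occupancy_measure(bin_colours, ball_colours, rng):
    """Drop each coloured ball into a bin chosen uniformly at random and
    count, for each bin colour, how many bins carry each profile of
    per-colour ball counts (the empirical occupancy measure)."""
    contents = [Counter() for _ in bin_colours]
    for colour in ball_colours:
        contents[rng.randrange(len(bin_colours))][colour] += 1
    measure = Counter()
    for bin_colour, content in zip(bin_colours, contents):
        profile = tuple(sorted(content.items()))
        measure[(bin_colour, profile)] += 1
    return measure

rng = random.Random(42)
bins = ["red" if i % 2 == 0 else "blue" for i in range(100)]
balls = ["red"] * 150 + ["blue"] * 50
m = occupancy_measure(bins, balls, rng)
```

In the sparse regime the per-colour ball counts in a typical bin are approximately independent Poisson, which is the weak limit established in the paper.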
  • Exact Matrix Completion via Convex Optimization
    (2016-03-10) Lotsi, A.; Doku-Amponsah, K.
    We consider a problem of considerable practical interest: the recovery of a data matrix from a sampling of its entries. Suppose that we observe m entries selected uniformly at random from a matrix M. Can we complete the matrix and recover the entries that we have not seen? We show that one can perfectly recover most low-rank matrices from what appears to be an incomplete set of entries. We prove that if the number m of sampled entries obeys m ≥ C n^1.2 r log n for some positive numerical constant C, then with very high probability most n × n matrices of rank r can be perfectly recovered by solving a simple convex optimization problem: the program finds the matrix with minimum nuclear norm that fits the data. The condition above assumes that the rank is not too large; however, if one replaces the 1.2 exponent with 1.25, the result holds for all values of the rank. Our results are connected with the recent literature on compressed sensing, and show that objects other than signals and images can be perfectly reconstructed from very limited information.
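The sketch below illustrates the recovery idea on a toy rank-1 example using rank-truncated alternating projections, a simple heuristic standing in for the nuclear-norm convex program analysed in the talk (all names illustrative):

```python
import numpy as np

def complete_rank_r(M_obs, mask, r, n_iter=500):
    """Fill in missing entries of a low-rank matrix by alternating between
    (i) truncating to the best rank-r approximation via the SVD and
    (ii) re-imposing the observed entries.  A heuristic stand-in for the
    minimum-nuclear-norm convex program."""
    X = np.where(mask, M_obs, 0.0)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = (U[:, :r] * s[:r]) @ Vt[:r]   # best rank-r approximation
        X = np.where(mask, M_obs, X)      # keep observed entries fixed
    return X

# Rank-1 test matrix with three entries hidden.
v = np.arange(1.0, 6.0)
M = np.outer(v, v)
mask = np.ones_like(M, dtype=bool)
mask[0, 1] = mask[2, 3] = mask[4, 4] = False
X = complete_rank_r(np.where(mask, M, 0.0), mask, r=1)
```

For a rank-1 matrix, each hidden entry is pinned down by observed entries in its row and column, which is why recovery from an incomplete sample is possible at all.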
  • Modified Zero-Inflated Negative Binomial Model with Application to a Clinical Trial in Epileptic Patients
    (2016-03-10) Iddi, S.
    Marginalized models are in great demand among researchers in the life sciences, particularly in clinical trial studies, epidemiology, health economics, and surveys, since they allow generalization of inference to the entire population under study. For count data, standard procedures such as Poisson regression and the negative binomial model provide population-averaged inference for model parameters. However, the occurrence of excess zero counts and the lack of independence in empirical data have necessitated extensions to accommodate these phenomena. These extensions, though useful, complicate the interpretation of fixed effects. For example, the zero-inflated Poisson model accounts for the presence of excess zeros, but its parameter estimates do not have the direct marginal interpretation of the base Poisson model. Marginalized models for excess zeros remain underdeveloped, even though demand for them is high. The aim of this talk, therefore, is to present a marginalized model for a zero-inflated univariate count outcome in the presence of overdispersion. Emphasis is placed on methodological development, efficient estimation of model parameters, and application to a clinical trial data set. Results of a simulation study performed to assess the performance of the model are also discussed.