UNIVERSITY OF GHANA

MODELING LARGE INSURANCE CLAIMS USING EXTREME VALUE THEORY: A CASE STUDY OF THE 37 MILITARY HOSPITAL

By

ABEKA COLLINS (10701682)

THIS THESIS IS SUBMITTED TO THE UNIVERSITY OF GHANA, LEGON IN PARTIAL FULFILMENT OF THE REQUIREMENT FOR THE AWARD OF MASTER OF PHILOSOPHY IN STATISTICS DEGREE.

NOVEMBER, 2020

DECLARATION

This is the result of research work undertaken by Abeka Collins under the supervision of Dr. Richard Minkah and Dr. Ezekiel Nii Noye Nortey. I declare that the entirety of the work contained therein is my own work, that I am the owner of the authorship thereof (unless explicitly stated otherwise) and that I have not previously submitted it, in its entirety or in part, for obtaining any qualification.

Abeka Collins (10701682) Signed: .................... Date: .................... (Student)

Dr. Richard Minkah Signed: .................... Date: .................... (Principal Supervisor)

Dr. Ezekiel N. N. Nortey Signed: .................... Date: .................... (Co-Supervisor)

ABSTRACT

The private health insurance industry is one of the vital components in nation-building. It complements government's efforts in reducing "out-of-pocket" payment for healthcare services in the country. However, some private health insurance companies face severe insolvency issues due to the accumulation of unanticipated, huge claim amounts. Extreme Value Theory (EVT) is a statistical tool proven to help solve or mitigate some of these challenges, since it focuses mainly on the behaviour of severe but rare occurrences. In this study, we employ the EVT approaches to model large insurance claims from the 37 Military Hospital, and to estimate financial risk indicators such as Value-at-Risk (VaR) and Expected Shortfall (ES), among other extreme quantiles. Conclusions drawn from the analysis established that the Weibull class of distributions is more appropriate for the data at hand; for this reason, the 37 Military Hospital is not likely to submit a claim amount exceeding 24,618 cedis on any given day. In addition, private health insurance firms can be assured at confidence levels of 99%, 99.5% and 99.9% that, within a day, the hospital is not likely to submit a claim amount exceeding 2,910 cedis, 3,938 cedis and 7,946 cedis respectively. Finally, it was recommended that the NHIA could replicate this study using the claims received by the public health insurance scheme (i.e. the NHIS), since this can go a long way to strengthen the financial sustainability of the scheme.

DEDICATION

This thesis is dedicated to my lovely wife, Mrs. Valentina Abeka-Arthur, to an angel of a daughter, Annette Aseda Abeka-Arthur, and to my caring parents, Mr. and Mrs. Abeka-Arthur.

ACKNOWLEDGEMENT

My foremost appreciation goes to Jehovah God Almighty for His grace and mercies in my life and for the strength He gave me to accomplish this thesis. My deepest gratitude goes to my supervisors, Dr. Richard Minkah and Dr. Ezekiel N.N. Nortey, for their immense time, support and guidance towards the completion of this thesis. I also recognise Dr. Samuel Iddi, Mr. Stephen Nkrumah, Mr.
Eric Ocran and Emmanuel Kojo Aidoo of the Statistics and Actuarial Science Department of the University of Ghana for their various ways of assisting me to complete this thesis. I acknowledge Major Richard Osei-Boateng, Officer-In-Charge of the Health Information System Department, and Mr. Isaac Addo, my immediate superior at the 37 Military Hospital, for their support and motivation in combining office duties with the writing of this thesis. Next in line are Lt. Emmanuel Dakubu, Mrs. Mary Adu-Brempong, Mr. Ebenezer Nortey and the entire Claims Unit of the 37 Military Hospital. I thank them for their diverse support in the completion of this thesis. Many more thanks to my parents, Mr. and Mrs. Abeka-Arthur, to my siblings, and to my friend and confidant, Pastor Bismark Okai, for their extreme support in my life and also in the completion of this thesis. Last and very important is my 'God-sent' wife, Mrs. Valentina Abeka-Arthur. I appreciate and thank her for being the "wind in my sail". May God richly bless you all.

Contents

DECLARATION
ABSTRACT
DEDICATION
ACKNOWLEDGEMENT
TABLE OF CONTENT
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS

1 INTRODUCTION
1.1 Background of Study
1.2 Problem Statement
1.3 Research Objectives
1.4 Significance of Study
1.5 Contributions of the Study
1.6 Outline of the Study

2 LITERATURE REVIEW
2.1 The Origin and Advancement of EVT
2.2 Applications of EVT and Scholarly Contributions

3 METHODOLOGY
3.1 Convergence of Maxima
3.2 Maximum Domain of Attraction
3.2.1 The Gumbel Family of Distributions
3.2.2 The Fréchet Family of Distributions
3.2.3 The Weibull Family of Distributions
3.2.4 The Block Maxima Method (BMM)
3.2.5 The Generalised Extreme Value Distribution (GEVD)
3.2.6 Estimation of Parameters
3.2.7 Estimation of Extreme Quantiles
3.3 Peaks-Over-Threshold Method (POTM)
3.3.1 Selection of Threshold
3.3.2 The Generalised Pareto Distribution (GPD)
3.3.3 Estimation of Parameters
3.3.4 Estimation of Extreme Quantiles
3.4 Value-at-Risk (VaR) and Expected Shortfall (ES)
3.5 Block Maxima Method (BMM) vs. Peaks-Over-Threshold Method (POTM)

4 DATA ANALYSIS AND FINDINGS
4.1 Preliminary Analysis
4.2 The Block Maxima Method (BMM)
4.2.1 Estimation of Parameters
4.2.2 Estimation of Extreme Quantiles
4.3 Peaks-Over-Threshold Method (POTM)
4.3.1 Threshold Selection
4.3.2 Estimation of Parameters
4.3.3 Estimation of Extreme Quantiles

5 CONCLUSIONS AND REMARKS
5.1 RECOMMENDATIONS
APPENDIX A
APPENDIX B

List of Figures

4.1.1 Scatter Plot and Histogram of Claims Submitted for the Period 2012 - 2019
4.1.2 Exponential QQ-Plot of Claims
4.2.1 Diagnostic Plots for the Proposed GEV Model
4.3.1 Mean Excess Plot of Claims Submitted, with Gray Lines Indicating 95% Confidence Intervals for the Mean Excess Values
4.3.2 Shape Parameter Stability Plot
4.3.3 Scale Parameter Stability Plot
4.3.4 Shape Parameter Stability Plot (zoomed)
4.3.5 Scale Parameter Stability Plot (zoomed)
4.3.6 Diagnostic Plots for the GPD-fit at Threshold of 30
5.1.1 Diagnostic Plots for the GEVD-fit After Log-Transformation
5.1.2 Diagnostic Plots for the Prospective Gumbel-fit
5.1.3 Diagnostic Plots for the GPD-fit at Threshold of 25
5.1.4 Diagnostic Plots for the GPD-fit at Threshold of 26
5.1.5 Diagnostic Plots for the GPD-fit at Threshold of 27
5.1.6 Diagnostic Plots for the GPD-fit at Threshold of 28
5.1.7 Diagnostic Plots for the GPD-fit at Threshold of 29
5.1.8 Diagnostic Plots for the GPD-fit at Threshold of 30

List of Tables

4.1.1 Summary Statistics
4.1.2 ADF Stationarity Test for Claim Amounts
4.2.1 Parameter Estimates for the GEV Model
4.2.2 Test for the Validity of the Gumbel Model
4.2.3 Model Comparison of the "Gumbel-class" vs. the "Weibull-class"
4.2.4 Return Period, Return Level and Exceedance Probability Estimates
4.2.5 Loss Mitigating Tools: 1-day-VaR Estimates
4.3.1 Estimates of Model Parameters and Upper Endpoints for the Prospective Thresholds
4.3.2 Negative Log-likelihood, AIC and BIC Scores for the Potential GPD Models
4.3.3 Return Period, Return Level and Exceedance Probability Estimates
4.3.4 Loss Mitigating Tools: 1-day-VaR and 1-day-ES Estimates
5.1.1 GEV Parameter Estimates for the Claim Maxima
5.1.2 Parameter Estimates After Variable Transformation
5.1.3 Return Period, Return Level and Exceedance Probability Estimates
5.1.4 Loss Mitigating Tools: 1-day-VaR and 1-day-ES Estimates
5.1.5 Potential Thresholds and the PWM Parameter Estimates
5.1.6 Return Period, Return Level and Exceedance Probability Estimates
5.1.7 Loss Mitigating Tools: 1-day-VaR and 1-day-ES Estimates

LIST OF ABBREVIATIONS

BMM Block Maxima Method
ES Expected Shortfall
EVD Extreme Value Distribution
EVI Extreme Value Index
EVT Extreme Value Theory
GEV Generalised Extreme Value
GEVD Generalised Extreme Value Distribution
ML Maximum Likelihood
MLE Maximum Likelihood Estimation (or Estimates)
PL-CI Profile Likelihood-based Confidence Interval
POTM Peaks-Over-Threshold Method
PSP Parameter Stability Plot
PWM Probability Weighted Moments
PWME Probability Weighted Moments Estimation (or Estimates)
VaR Value-at-Risk

Chapter 1

INTRODUCTION

Classical data analysis is mostly concerned with the characteristics of the centre of the distribution of a data set. It usually deals only slightly with outliers (extrema) and assumes the normal distribution as a suitable asymptotic fit for the data, given a sufficiently large sample size, even when the individual observations may not be normally distributed. Thus, little or sometimes no attention is given to the tail behaviour of the distribution and, as a result, the model formulated to fit the data mathematically underestimates statistical inferences for rare but severe observations. Consequently, room is given to inadequate preparedness in preventing or mitigating damages that may arise from the occurrence of such extreme events.

Taking the health insurance industry in Ghana, for instance, the presence of high and increasing claim costs has placed the potency of the country's public (national) health insurance scheme under severe financial pressure, and this has contributed to the scheme's inability to pay claims on time to healthcare providers (NHIS Review, 2020). As a result of the scheme's indebtedness, some of the hospitals end up demanding "out-of-pocket" payment from insured patients for medical services provided (Donkor, 2014; Ghanaweb Network, 2015). Consequently, an increase in the financial burden on healthcare can be expected for the residents of the country. Another example of the catastrophic effect of severe but rare observations can be seen in the St. Helens Insurance Company Limited in the United Kingdom (Massey, 2003). Massey narrated that the St. Helens insurance company ceased underwriting after incurring large losses from Hurricane Betsy in 1965, and eventually folded up in 1967, mainly as a result of huge unanticipated claims and insufficient reinsurance.
Other tragic effects of extreme happenings include: an increased rate of Cerebrospinal Meningitis (CSM) infection due to the occurrence of high temperature levels in dry seasons (Codjoe and Nabie, 2014); destruction of infrastructure and farms by floods due to the presence of high rainfall levels (Nkrumah, 2017); and the Global Financial Crisis, caused by the ripple effect of extremely low prices in the US housing market, which led to a high number of mortgage loan defaulters and, in turn, to insolvency of the banks and investors who gave out the loans (Reserve Bank of Australia, 2018).

Considering the damages left behind by the occurrence of extreme realisations, it becomes more appropriate to separately model the tails of a distribution when accurate extrapolation of maximum or minimum observations is of concern; and this is where Extreme Value Theory (EVT) comes into play. EVT is a statistical framework that explicitly deals with methods for modelling and inference of severe but infrequent events (Nortey et al., 2017). It focuses its attention on the behaviour of a distribution's tails by examining very high and very low observations in a data set (Minkah, 2016). The applicability of EVT spans hydrology (Minkah, 2016; Nortey et al., 2017), climatology (Nkrumah, 2017), finance (Danielsson and De Vries, 2000; Allen et al., 2011; Uppal, 2013; Nortey et al., 2015), insurance (Embrechts et al., 2013; Pérez-Fructuoso and García Pérez, 2010; Adesina et al., 2016; Weru and Waititu, 2019) and sports (Sérgio, 2012).

The concern of this study is to employ the EVT methods to assess and infer the behaviour of large insurance claims in the private health insurance industry, using the 37 Military Hospital as a case study.

1.1 Background of Study

Health insurance is the protection a person secures against financial detriment as a result of an untimely illness that is capable of causing monetary loss to the person. Boadu et al. (2014) record that the insurance system in Ghana started in 1924 with Enterprise Insurance Company Limited (formerly, Royal Guardian Enterprise). They continue that the insurance system is divided into life insurance, non-life insurance, and composite insurance (a combination of life and non-life insurance). In Ghana, the National Insurance Commission (NIC) is the sole institution with the authority to regulate and supervise insurance activities in the country under the Insurance Act, 2006 (Act 724) of the 1992 constitution of Ghana (National Insurance Commission, 2019). Currently, there are 142 regulated insurance entities, made up of 24 life insurance companies, 29 non-life insurance companies, 3 reinsurance companies and 85 insurance brokers and loss adjusters (National Health Insurance Authority, 2020).

Over the years, one component of non-life insurance that has seen a facelift by successive governments in the country is health insurance. Though with different approaches, the implementation of the Health Insurance Scheme is one which all the major political parties have openly embraced, in line with the country's pursuit of reducing financial hardship in accessing healthcare delivery. Prior to the first republic, healthcare financing was predominantly by "out-of-pocket" payments at the point of service (Arhinful, 2003).
It then became almost free, as with other social services, during the first republic, and later saw a "turn-around" with the "cash-and-carry" system following the overthrow of the then government (Wagstaff, 2010). The "cash-and-carry" system of healthcare financing requires patients to pay part or the full amount of the cost of health services, in order to reduce the burden on government finances in the health sector (Owusu-Sekyere and Chiaraah, 2014). However, the "cash-and-carry" system became highly ostracised among the citizens and was eventually replaced by the National Health Insurance Scheme (NHIS) - a social intervention by government to reduce the cost of healthcare for its residents (National Health Insurance Scheme, 2020).

The National Health Insurance Authority (2020) states that the NHIS was established under the National Health Insurance Act, Act 650, in 2003, and later amended in 2012 by Act 852. It continues that the NHIS was instituted by the government of Ghana to finance basic healthcare services for residents in the country through public and private health insurance schemes. The National (public) Health Insurance Scheme (NHIS) is managed and largely funded by the government through tax revenues, statutory deductions (such as the National Health Insurance Levy (NHIL) and SSNIT contributions), and other government subventions. It is also funded by premiums - an agreed amount borne by an insured for a contract of insurance - from informal sector subscribers (National Health Insurance Authority, 2020). The NHIS benefit package covers about 95 percent of diseases in the country, and is accepted in all government health facilities and some accredited private healthcare centres (National Health Insurance Scheme, 2020).

The Private Health Insurance Scheme (PHIS), on the other hand, offers financial protection on a wider range of health issues and allows coverage of certain healthcare services that are excluded under the public health insurance scheme. Examples include VIP ward accommodation, photography, elective heart and brain surgeries, dialysis treatment, and treatments overseas. The National Health Insurance Authority (2018) asserts that there are two types of private health insurance schemes in Ghana - the Private Mutual Health Insurance Scheme (PMHIS) and the Private Commercial Health Insurance Scheme (PCHIS). The former operates exclusively for the benefit of its members and can be formed by any identifiable group of persons in the country, while the latter is a limited liability insurance company that is operated on a profit motive and is purely funded by premiums paid by its subscribers. Currently, there are about 14 companies licensed by the NHIA to operate as Private Commercial Health Insurance Schemes (National Health Insurance Scheme, 2020). Notable among them are Nationwide Medical Insurance Company Limited, Premier Health Insurance Company Limited, Acacia Health Insurance Company and Glico Healthcare Company Limited.

The health insurance system comprises three key stakeholders - the insurer (insurance company), the insured (subscriber), and the service provider (healthcare facility) - whose roles are crucial in the overall quest for sustainable delivery of quality and accessible healthcare service.
In Ghana, the Health Facilities Regulatory Agency (HEFRA), in collaboration with the National Health Insurance Authority (NHIA), is respectively responsible for supervising and licensing healthcare facilities that wish to avail their services to the country's health insurance industry. Aside from pharmaceutical and government facilities, there are about 1,370 healthcare facilities licensed to provide services in the country as at March 2020 (Health Facilities Regulatory Agency, 2020). According to the Ministry of Health (2016), the categories of healthcare facilities that can be credentialed by the NHIA to provide services to residents include licensed chemical shops, diagnostic centres, Community-based Health Planning and Services (CHPS), maternity homes, clinics, and hospitals.

The hospital is one of the health facilities that provide a broad range of healthcare services to a wider populace. Ghana's health system classifies hospitals, on the basis of management and administration, as private, government, or quasi-government hospitals. In addition, hospitals can also be grouped into primary, secondary, or tertiary based on the level of healthcare delivery. One such hospital that has proven to be of significant value in healthcare delivery to the country is the 37 Military Hospital.

An interview at the 37 Military Hospital with Major Richard Osei-Boateng (Officer-In-Charge of the Health Information System Department) and Mr. Isaac Addo (Officer-In-Charge of the Health Insurance Unit), on the same account, attests that the 37 Military Hospital was initially instituted in July 1941 to provide services for troops injured in the Second World War. In addition, it is classified as a tertiary quasi-government hospital, established by the Ghana Armed Forces to primarily provide healthcare to its serving and ex-service personnel and their families, to civilian employees of the Ministry of Defence and their families, and also to the general public as a referral healthcare facility. The 37 Military Hospital is made up of several departments - surgical, medical, obstetrics, gynaecology, paediatrics, dental, physiotherapy, radiology, pathology, and a chemist department. In addition to being a teaching hospital, the 37 Military Hospital serves as a National Disaster and Emergency Response health facility and also comes to the rescue in times of industrial strikes by government healthcare professionals (Tv3 Network, 2015). The administration of the hospital is headed by the Commander of the unit, assisted by the Commanding Officer and other high-ranking military officers. Healthcare financing at the 37 Military Hospital by the general public is either through "out-of-pocket" payment (cash or cheque), through insurance (NHIS or an accredited PHIS), or by the social welfare organisation.

1.2 Problem Statement

One of the targets of universal health coverage is to ensure that people obtain healthcare services without suffering financial hardship when paying for them (De Wolf and Toebes, 2016). The private health insurance industry plays an important role in achieving this goal, since it complements government's efforts in financing healthcare services and thus reduces "out-of-pocket" payment for healthcare needs.
However, the sustainability of healthcare financing through insurance is sometimes very challenging for health insurance companies, especially when vetted claim amounts are very large and not easily anticipated due to their rare occurrence. As a result, health insurance companies are likely to find themselves with insolvency issues when these claim amounts accumulate and remain unpaid. The additional effect is for healthcare providers to suspend their services on an insurance basis and revert to the "cash-and-carry" system of healthcare delivery. Based on the records of the private health insurance unit of the 37 Military Hospital, the following were brought to light:

• In the last quarter of 2016, the facility suspended services to four partnered health insurance companies - Med-X Health Systems Limited, First Fidelity Health Insurance, Oval Health Insurance and Empire Health Insurance Limited - mainly as a consequence of their high indebtedness to the hospital.

• In 2018, services to companies like GN Medicals Limited and Beige Healthcare Limited were suspended as a result of insolvency issues.

• There were records of long-overdue unpaid claim amounts, which in the long run could bring about a possible reversion to the "cash-and-carry" system for the general public.

In light of this, the researcher employs the extreme value theory framework to model and predict the behaviour of very large claim sizes that are likely to occur when dealing with the 37 Military Hospital. This, we believe, could help the health insurance companies to position themselves better on financing medical care for their clients at the hospital, as the above-mentioned problems are curtailed.

1.3 Research Objectives

• To identify and fit an appropriate limiting distribution to describe the tail behaviour of the distribution of large health insurance claims.

• To approximate the value of the maximum possible claim amount that can be generated by the hospital in a day.

• To forecast, with a given probability, the claim amount that could be exceeded once every 5, 10, 20, 50 and 100 months respectively.

• To project, with some level of confidence, some extreme claims that could not be exceeded in the next 24 hours.

1.4 Significance of Study

Universal health coverage (UHC) ensures that people have access to healthcare services whilst being protected from spending more than 30% of their household income on health expenditure (Akinola and Dessislava, 2019). The World Health Organization (WHO) reports that though governments are spending more on health, residents are still paying much out of their own pocket (World Health Organization, 2019). This brings to light the need for, and the sustainability of, health insurance schemes that are capable of providing substantive financial protection against health expenditure. As part of the prudent measures for ensuring financial soundness in the sustainability of these schemes, it is important for health insurance companies to be able to anticipate the likelihood of receiving very large claims that could lead to large losses and eventually cause financial instability or even disrupt their operations with healthcare facilities. In this regard, the study is significant in the following ways:

• The study will help private health insurance companies to better predict the likelihood of large insurance claims and the frequency with which they occur when it comes to dealing with the 37 Military Hospital.
• The study will greatly aid insurance companies in the calculation of the premium volumes needed to cover future huge claims for their clients who visit the 37 Military Hospital.

• The goodwill between the hospital and the insurance companies will be boosted, since companies will be able to pay claims on time due to adequate liquidity reserves and reinsurance measures.

• The study will also benefit the hospital by serving as a working document for future partnership agreements with any health insurance firm.

• The study will, in the long run, help to prevent the need for "out-of-pocket" payment for healthcare delivery due to the readiness of health insurance companies to reimburse the hospital.

• The replication of the study in other healthcare facilities or insurance firms across the country can help ensure that the health insurance sector is stronger and well positioned to serve the growing demands of the public.

1.5 Contributions of the Study

The main contributions of this thesis are as follows:

• Academically, the study will contribute to present literature as a Ghanaian perspective on the practice of EVT in the insurance industry, particularly the private health insurance industry.

• Financially, the 37 Military Hospital stands to gain immensely from this work, as it is able to anticipate the maximum possible revenue it can generate from its partnered private health insurance companies at a particular point in time.

• Findings of this work will greatly contribute to the dealings of the Private Health Insurance Association of Ghana (PHIAG) with healthcare facilities across the country, as huge claim amounts are easily anticipated and later indemnified in the periods in which they occur.

• In terms of integrity, both private health insurance companies and healthcare facilities can use this work to investigate certain large claim amounts that occur in unexpected time periods.

1.6 Outline of the Study

Chapter 1 of the study captures the motive for the research and introduces the statistical tool of interest for the study. It further provides a background investigation of the study and identifies the problems the study is meant to resolve. It then captures the specific goals of the study, its significance, and its contribution to both literature and the stakeholders involved. Chapter 2 unfolds the origin and advancement of the Extreme Value Theory (EVT) and also peruses some of the scholarly contributions and diverse applications of EVT in real life. Chapter 3 is centred on the EVT methodology and discusses the parametric EVT approaches that can be employed to describe the characteristics of extreme occurrences. Chapter 4 captures the data analysis and gives a description of the data at hand. This chapter also covers the GEV model and the GP model that were used to describe the large claims through the estimation of parameters and extreme quantiles under both MLE and PWME. Chapter 5 draws conclusions from the findings obtained in Chapter 4, and then makes some recommendations for further research.

Chapter 2

LITERATURE REVIEW

2.1 The Origin and Advancement of EVT

Extreme Value Theory (EVT) is the field of science that deals with extraordinary deviations from the mean of a probability distribution (Allen et al., 2011). In particular, extreme value analysis focuses on the estimation of the probabilities of events that are greatly different from any ever observed in samples of some specified size (Coles, 2001).
The origin of the study of extreme value statistics, as stated by Kinnison (1985), began with the early astronomers, who were confronted with the choice of either using or excluding observations that seemed to be very distant from the rest of a data set. He added that though these astronomers were able to specify the problem of extreme deviations in statistics, the mathematical tools available were too crude to solve it. Kinnison further stated that, in 1922, L. von Bortkiewicz found good numerical approximations which indicated that the sample maxima from a Gaussian population become new variables with separate distributions. He acknowledged that Bortkiewicz was the first to clearly state the extreme value problem in statistical terms. Kinnison continued that, in the subsequent year, von Mises introduced the mathematical concept of the expected largest value in a Gaussian sample, and this became the beginning of the study of the asymptotic distribution of extreme values in samples from Gaussian distributions.

The advancement of EVT, as narrated by Sérgio (2012), started in 1925 when Leonard Tippett studied material resistance by examining the strength of wool for the cotton industry in Britain. Tippett discovered that the strength of wool largely relied on the strength of the weakest fibres and not the average fibre. Sérgio added that, with the help of Sir Ronald Fisher, Tippett created a probabilistic theory by means of tables of large values and found corresponding probabilities to serve as a framework to be applied to sample maxima from a normal distribution. Kinnison (1985) again narrated that Fisher and Tippett published the paper which is now known as the foundation of the asymptotic theory of extreme value distributions, and consequently suggested the three domains of attraction - the Gumbel, Fréchet, and Weibull domains of attraction - to characterise the distribution of sample extremes. Fisher and Tippett also indicated that the distribution of a Gaussian sample maximum converges extremely slowly towards its asymptotic distribution.

Kinnison (1985) recorded that von Mises, in 1936, made an improvement by establishing some simple and sufficient conditions for the weak convergence of the largest order statistic to each of the three families of distributions. However, Boris Vladimirovich Gnedenko, in 1943, formalised all the previous knowledge about extremes and provided a much more rigorous foundation for the necessary and sufficient conditions for the weak convergence of the extreme order statistics; this is known as the first theorem (i.e. the Fisher-Tippett-Gnedenko theorem) of Extreme Value Theory (Sérgio, 2012). Later on, an extensive bibliography of the literature and early development of the statistical analysis of extremes, from a theoretical and practical point of view, was presented by Emil Julius Gumbel in 1958 in his book "Statistics of Extremes" (Kinnison, 1985). Since then, EVT has found its way into explaining the underlying mechanisms in many areas where the behaviour of extreme happenings is of utmost concern.

2.2 Applications of EVT and Scholarly Contributions

In Ghana, reference can be made to the application of EVT in modelling climatological and hydrological data. Nkrumah (2017) applied the EVT in estimating the occurrence of extreme temperature and extreme rainfall in Ghana for the period January 1960 to December 2012, using data from some selected regions of the country.
Employing both the Generalized Extreme Value (GEV) model and the Generalized Pareto (GP) model, he found that the extreme occurrences of temperature and rainfall can be modelled using the Fréchet and Weibull classes of distributions respectively. This allowed him to predict temperature maxima of 34.7, 34.6, and 39.6 degrees Celsius to be anticipated in the Accra, Ashanti and Northern regions respectively, once every five years. He also expected rainfall amounts of 88.38 mm, 85.61 mm, and 81.98 mm in the Accra, Ashanti, and Northern regions respectively, once every 5 years.

Also, Minkah (2016) and Nortey et al. (2017), both studying the water levels of the Akosombo dam in Ghana, made use of EVT methods in their analyses of the dam's critical water levels. By means of the Generalized Pareto Distribution (GPD), Minkah (2016) estimated the probability and return period of very low water levels that can lead to a complete shutdown of the dam's operations, and of very high water levels that lead to spillage of excess water and cause flooding. While he found a negligible probability of a complete shutdown of the dam - since the left endpoint estimate is greater than the dam's critical level of 226 ft - he also established the need for an extension of the dam in order to reduce the probability of a flood occurring, or to even prolong the return period of a flood, since the right endpoint estimate was greater than the dam's maximum operating level of 278 ft. Employing a different approach of the EVT, Nortey et al. (2017) arrived at the same results by using the Generalized Extreme Value Distribution (GEVD) to study the tail behaviour of the dam's water level. However, they discovered evidence of the water level falling below the dam's minimum operating level in the future.

In the aspect of risk-management strategies in the financial industry, Nortey et al. (2015) modelled the outlier values of the Ghana Stock Exchange all-shares index by applying the conditional approach of the EVT to model the tails of the daily stock returns data. Applying the Peaks-Over-Threshold method, they fitted an ARMA-GARCH model to the data to correct for the effects of autocorrelation and conditional heteroscedastic terms present in the data. They also employed the Maximum Likelihood (ML) method to estimate the model's parameters. They established that the EVT model provided a better fit to the tail distribution of the returns because the data set contained more severe observations. Moreover, the observed volatility in the daily returns data justified the conditional EVT approach as the preferred statistical technique for the study. In addition, the Maximum Likelihood (ML) estimates for the parameters of the GPD were also found to be associated with lower standard errors compared to the Probability Weighted Moment (PWM) estimates when applied to the large data set. In the end, their findings established the GPD as the best fit for dealing with the behaviour of extreme share index values in the stock market.

Elsewhere around the world, we see the employment of EVT in decisions on portfolio selection and pricing. Danielsson and De Vries (2000) proposed a semi-parametric EVT-based method for Value-at-Risk (VaR) estimation for a portfolio of stocks.
They stated that while there exist several methods for VaR estimation - some based on the use of conditional volatilities, like the RiskMetrics method, and others relying on the unconditional historical simulation method - the EVT-based VaR for the estimation of tail probabilities performs better. They illustrated that the GARCH-based RiskMetrics method underpredicts VaR, and that historical simulation suffers from high variance and discrete sampling in the tails of the distribution. In addition, historical simulation is not able to capture losses outside the sample it analyses. They concluded that the reason for the improved performance of the extreme value method is its combination of some of the advantages of the other competing methods. Gencay and Selcuk (2004) also arrived at the same conclusion: the dynamic EVT-based approach to calculating VaR gave more accurate forecasts than the other competing approaches and also has the advantage of reacting to changing market conditions.

Again, by using the EVT framework, Allen et al. (2011) computed the Value-at-Risk (VaR) measure in order to accurately assess the risk of severe losses on investment, especially when the distribution of negative returns is characterised by heavy realisations that sometimes fall outside the bounds of a normal distribution. In a similar vein, Uppal (2013) examined the performance of market risk measures based on dynamic EVT to estimate the probabilities of large deviations of returns in financial markets for five developed and five emerging markets, before and during the Global Financial Crisis (GFC). In his work, he established that the Value-at-Risk (VaR) measure based on the dynamic EVT procedure performed best among the other VaR estimation methods. He further indicated that the GPD model fits the observed distribution of extreme values well, in both the pre-crisis and the crisis periods, with the exception of the US markets, where the crisis originated.

Another field where EVT can "play" is sports science. Sérgio (2012) applied the EVT to estimate the maximal amount of oxygen needed to break records set by top athletes. Using both the parametric and semi-parametric EVT approaches, he examined the probability of beating Usain Bolt's record of 9.58 seconds in the 100-metre race, and of exceeding the maximum oxygen uptake for speed (96 ml/kg/min) recorded by Bjorn Daehlie and Espen Harald Bjerke, the world's fastest snow-skiers. In the maximum oxygen uptake (VO2max) snow-skiing analysis, he indicated that the semi-parametric approach rendered findings more concordant with those obtained under the Peaks-Over-Threshold (POT) method when the Probability Weighted Moment (PWM) method is used for parameter estimation. He also demonstrated that the estimates for the right endpoint of the distribution functions are asymptotic to the sample maximum. In other words, the record of Bjorn Daehlie and Espen Harald Bjerke can hardly be broken. On the other hand, in the analysis involving the 100-metre race, the semi-parametric approach yielded similar results to those of the POT method when the Maximum Likelihood (ML) method is used for parameter estimation. Furthermore, the analysis revealed that, in the present circumstances, a 100-metre-race athlete can improve his speed, with a likelihood of breaking the current record of Usain Bolt under 9 seconds.
EVT procedures have also been utilised in the insurance industry to provide appropriate tools to mitigate losses (Embrechts et al., 2013). Pérez-Fructuoso and García Pérez (2010) demonstrated one of the performances of EVT in the Spanish motor liability insurance industry. They employed the EVT framework to perform a parametric estimation by fitting loss severities to the Generalized Pareto Distribution. They illustrated how EVT improves on classical adjustments by going beyond the average to consider outlier behaviour. Furthermore, they demonstrated that adjustments with EVT-based distributions in modelling the tails of loss severities significantly improve tail-distribution inference. They concluded that EVT deals well with the behaviour of extreme events and allows both the insurer and the reinsurer to evaluate the expectation of losses over a certain level, and the premiums sufficient to cover such losses.

In Africa, particularly Kenya and Nigeria, mention can be made of Wainaina and Waititu (2014), Adesina et al. (2016), and Weru and Waititu (2019), who applied the EVT to the insurance sectors of their economies. Wainaina and Waititu (2014) employed the extreme value theory approach to model huge losses arising from Kenya's fire insurance industry. Empirical analysis carried out by them revealed that the GPD model is the best fit to the returns, since the distribution of returns was heavy-tailed. In addition, they performed an EVT-based quantile estimation of the distribution of excesses over a sufficiently high threshold at different confidence levels. Their results demonstrated how well the EVT applies to the insurance industry in terms of dealing with large but infrequent claims. In a related work, Weru and Waititu (2019) also centred their study on the tail behaviour of claims in the motor insurance sector using Kenindia Assurance Company Limited. They employed the EVT framework as a study practice to help in determining the company's future maximum loss and the frequency with which such losses occur. They fitted a GPD model to large claims above a certain threshold, and estimated risk measures such as Value-at-Risk and Expected Shortfall based on extreme value analysis. They then compared the results to other competing methods of VaR estimation, and concluded that though the other methods of estimation perform well, the EVT approach to modelling seems to perform better with fat-tailed data.

A replicated study of Weru and Waititu (2019) was carried out by Adesina et al. (2016) in Nigeria. Adesina et al. (2016) estimated the Value-at-Risk quantile by using the EVT methods to model large losses of Nigeria's motor insurance business. They chose the GPD approach over the GEV approach because of their interest in the overall behaviour of the tail area of the distribution of returns, rather than just the maximum loss in each year. They also compared EVT-based VaR estimates to VaR based on the Historical and Gaussian methods at the 5% level, and concluded that the EVT-based VaR is most suitable for calculating VaR as long as the outlier behaviour of a dataset is of concern.

Coles (2001) listed a few other areas where extreme value analysis has been applied: alloy strength prediction, ocean wave modelling, memory cell failure, wind engineering, biomedical data processing, thermodynamics of earthquakes, non-linear beam vibrations, and food science.
In this study, the focus is on applying the EVT methods to the health insurance sector of the Ghanaian economy, in which we find and fit an appropriate limiting distribution to large claim amounts submitted by the 37 Military Hospital. We also examine the likelihood of very large claims occurring above a fixed high amount, and the time interval at which these extreme claims occur.

Chapter 3

METHODOLOGY

One of the main purposes of statistics is to take and analyse a portion of a population in order to make conclusive statements about the characteristics of the population as a whole. Usually, if the focus is on the behaviour of the central part of a distribution, then the Central Limit Theorem (CLT) helps in addressing related questions. However, if the subject of interest is explicitly the behaviour of the largest (or lowest) realisations, it becomes statistically prudent to employ Extreme Value Theory (EVT) - a set of statistical techniques for the modelling and estimation of rare events (Minkah, 2016; Nortey et al., 2017) - to address related questions. We begin by first considering the convergence of maxima.

3.1 Convergence of Maxima

One of the key concerns in the financial sustainability of any private health insurance scheme lies in making accurate projections of maximum claims and when they occur. Consider $X_1, \ldots, X_n$ as a sample of independent and identically distributed claims having a common continuous but unknown distribution $F$. Then we can define the maximum value of the sequence as:

$$M_n = \max\{X_1, \ldots, X_n\}. \quad (3.1)$$

As the sample size $n \to \infty$, $M_n \to x_F$, where $x_F$ is the upper endpoint of $F$, defined as $x_F = \sup\{x \in \Re : F(x) < 1\}$. Thus, when we increase the sample size and consider the largest observation, it will keep moving to the right and finally converge to the upper endpoint. Also, the probability distribution of $M_n$ is given by:

$$P(M_n \le x) = P(X_1 \le x, \ldots, X_n \le x) = P(X_1 \le x) \times \cdots \times P(X_n \le x) = F^n(x). \quad (3.2)$$

But as $n \to \infty$, $F^n(x) \to 1$ for $x \ge x_F$ and $F^n(x) \to 0$ for $x < x_F$, so that $M_n$ has a degenerate limiting distribution. For mathematical and statistical implementation, however, we need a non-degenerate limit. Therefore, in order for the distribution of $M_n$ not to degenerate at the upper endpoint as $n$ increases, we rescale $M_n$ by applying normalising constants for location and scale. This leads us to:

$$M_n^* = \frac{M_n - b_n}{a_n},$$

where $a_n > 0$ and $b_n$ are normalising constants. We remark that even though the focus of this thesis is on modelling the behaviour of sample maxima, results for sample minima can be obtained from:

$$\min(X_1, \ldots, X_n) = -\max(-X_1, \ldots, -X_n).$$
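To make the rescaling idea concrete, the short simulation below is an illustrative sketch (not part of the thesis analysis): it draws repeated samples from a unit-exponential distribution, for which the normalising constants may be taken as $a_n = 1$ and $b_n = \ln n$, and compares the empirical distribution of the rescaled maxima with the standard Gumbel limit $e^{-e^{-x}}$.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2020)

n, replications = 1000, 5000                 # sample size per maximum, number of maxima
samples = rng.exponential(scale=1.0, size=(replications, n))

# Raw maxima drift towards the upper endpoint (infinity for the exponential),
# so we rescale with a_n = 1 and b_n = ln(n), the standard choice for Exp(1).
raw_max = samples.max(axis=1)
rescaled = raw_max - np.log(n)

# Compare the rescaled maxima with the standard Gumbel distribution G(x) = exp(-exp(-x)).
grid = np.linspace(-2, 6, 9)
empirical = np.array([(rescaled <= x).mean() for x in grid])
gumbel_cdf = stats.gumbel_r.cdf(grid)

for x, e, g in zip(grid, empirical, gumbel_cdf):
    print(f"x = {x:5.2f}   empirical = {e:.3f}   Gumbel limit = {g:.3f}")
```

The empirical proportions agree closely with the Gumbel values, illustrating how normalisation turns the degenerate limit of $M_n$ into a usable, non-degenerate one.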
3.2 Maximum Domain of Attraction

Usually, it is difficult and sometimes practically impossible to know the true probability distribution of a set of data in real life. We therefore use a limiting (asymptotic) distribution to approximate the true distribution of the data. Let us consider the limiting distribution for a large number of $M_n$, where $M_n$ is as defined in (3.1). Given $a_n > 0$ and $b_n$ as normalising constants such that $\frac{M_n - b_n}{a_n}$ converges in distribution, then from (3.2),

$$P\left(\frac{M_n - b_n}{a_n} \le x\right) = P(M_n \le a_n x + b_n) = F^n(a_n x + b_n) \approx G_\xi(x), \quad \text{as } n \to \infty, \quad (3.3)$$

where $G_\xi$ is a non-degenerate distribution function. Thus, the unknown distribution $F$ of the maxima is said to be in the Maximum Domain of Attraction (MDA) of some distribution $G_\xi$; symbolically, $F \in \mathrm{MDA}(G_\xi)$. In other words, for a sufficiently large sample, the rescaled sample maxima of data with unknown distribution $F$ are attracted to the limiting distribution $G_\xi$, which belongs to one of a unique set of families of distributions - the Gumbel family, the Fréchet family, and the Weibull family (Fisher and Tippett, 1928).

3.2.1 The Gumbel Family of Distributions

This class of distributions is usually characterised by density functions whose tails decay exponentially (light tails) as we move towards the upper endpoint of the distribution. They have distribution functions characterised by a shape parameter $\xi = 0$ and are of the form:

$$G_\xi(x) = e^{-e^{-\left(\frac{x-b}{a}\right)}}, \quad -\infty < x < \infty, \quad (3.4)$$

where $a$ and $b$ are normalising constants.

3.2.2 The Fréchet Family of Distributions

This class of distributions is characterised by a shape parameter $\xi > 0$ and an infinite upper endpoint. A distribution is said to be heavy-tailed if we observe a polynomial decay in the tails of its density function as we move towards the upper endpoint of the distribution. This class of distributions is very often encountered in finance and insurance data. The Fréchet class of distributions takes the form:

$$G_\xi(x) = \begin{cases} e^{-\left(\frac{x-b}{a}\right)^{-1/\xi}}, & x > b, \ \xi > 0, \\ 0, & x \le b, \end{cases} \quad (3.5)$$

where $a$ and $b$ are normalising constants and $\xi$ is the shape parameter (Coles, 2001). Also, $E(X^k) < \infty$ if $k < \frac{1}{\xi}$ and $E(X^k) = \infty$ if $k \ge \frac{1}{\xi}$. Examples of distributions that are attracted to the maximum domain of the Fréchet family are the Cauchy, Pareto, Fréchet, Student t, and log-gamma distributions.

3.2.3 The Weibull Family of Distributions

This class of distributions always has a finite upper endpoint and a short decay in the tails of its density function (i.e. it is light-tailed). It has a negative shape parameter and an upper bounded tail at $b - \frac{a}{\xi}$. The distribution function of the Weibull class is of the form:

$$G_\xi(x) = \begin{cases} e^{-\left(-\frac{x-b}{a}\right)^{-1/\xi}}, & x < b, \ \xi < 0, \\ 1, & x \ge b, \end{cases} \quad (3.6)$$

where $a$ and $b$ are normalising constants and $\xi$ is the shape parameter (Coles, 2001). Examples of distributions in the maximum domain of attraction of the Weibull family are the Beta, Weibull, and Uniform distributions.

The families of distributions mentioned above are the only possible extreme value limiting distributions for the normalised sample maxima, regardless of the underlying distribution function from which the samples are drawn (Fisher and Tippett, 1928; Gnedenko, 1943; Coles, 2001). As noted, each of the three families of extreme value distributions gives a different representation of extreme value behaviour. These representations are primarily determined by the tail characteristic of the distribution function (i.e. the shape parameter $\xi$), and this can be inferred from the behaviour of the upper endpoint of the underlying distributions. While the upper endpoint for the Weibull class of distributions is finite and that of the Fréchet class is infinite, the Gumbel class of distributions can assume either a finite or an infinite upper endpoint. Moreover, the density function of the Gumbel class of distributions decays faster in the tail than that of the Fréchet class.

These three standard extreme value distribution classes can be put into practice in EVT through two main approaches - the Block Maxima Method (BMM) and the Peaks-Over-Threshold Method (POTM). Let us consider the former in the subsequent section.

3.2.4 The Block Maxima Method (BMM)

The BMM works by splitting the data into groups (blocks) of equal length, say yearly, monthly, weekly, or even daily.
Depending on the available data, the groups can also be formed according to strength, weakness, or even speed (Sérgio, 2012). The BMM then considers the maximum of each block, treats these maxima as a new data set, and runs the analysis by fitting the GEV distribution to the normalised maxima. This approach operates very well in the environmental sciences owing to the natural blocking (seasons) in nature. However, it can also be used for financial and insurance data, even though this is not highly recommended (Embrechts et al., 2013).

One of the main problems associated with the BMM is the choice of block length - a compromise in the "bias-variance" trade-off. Small block lengths generate more sample maxima for model estimation and consequently reduce the variance and improve the accuracy of the model. However, choosing small block lengths may include intermediate observations, which could increase the bias of the parameter estimates in the model. In contrast, choosing large block lengths leads to a small number of maxima that better satisfy the asymptotic assumptions underlying the model. This eventually results in small bias, but at the expense of large variance. For this reason, the use of a block length of one year has become the pragmatic choice when deciding on block length for EVT analysis of climatological and hydrological data, hence the name Annual Maxima Method (AMM).

3.2.5 The Generalised Extreme Value Distribution (GEVD)

In practice, it then follows that one of the three limiting distribution families has to be selected and its parameters estimated when applying the EVT framework. However, there is the challenge of choosing a technique for the right selection of the distribution that is best for the data at hand. In order to simplify statistical implementation, Mises (1936) and Jenkinson (1955) unified these three families of distributions into a single "three-parameter" family of distributions called the Generalised Extreme Value Distribution (GEVD). The advantage of this unified model is its ability to determine the domain of attraction using only the shape parameter. This means that, through inference on the shape parameter, there is no need for a subjective prior selection of the best limiting distribution to adopt for the data.

Definition: Suppose that $M_n = \max\{X_1, \ldots, X_n\}$ for a set of i.i.d. random variables, and that there exist normalising constants $a > 0$ and $b$ such that $P\left(\frac{M_n - b}{a} \le x\right) \approx G_\xi(x)$ as $n \to \infty$, where $G_\xi(x)$ is a non-degenerate distribution function. Then, according to Jenkinson (1955), $G_\xi(x)$ belongs to the family of distributions of the form:

$$G(x; \mu, \sigma, \xi) = \begin{cases} e^{-\left[1 + \xi\left(\frac{x-\mu}{\sigma}\right)\right]^{-1/\xi}}, & \text{for } \xi \ne 0, \ 1 + \xi\frac{(x-\mu)}{\sigma} > 0, \\ e^{-e^{-\left(\frac{x-\mu}{\sigma}\right)}}, & \text{for } \xi = 0, \ -\infty < x < \infty, \end{cases} \quad (3.7)$$

where $-\infty < \mu < \infty$, $\sigma > 0$ and $\xi$ represent the location, scale and shape parameters respectively. Furthermore, the distribution function $G$ is in the Gumbel domain if $\xi = 0$, the Fréchet domain if $\xi > 0$, and the Weibull domain if $\xi < 0$. In the literature, $\xi$ is usually referred to as the Extreme Value Index (EVI) or tail index, as it is a key feature in determining the heaviness of the tail of an underlying unknown distribution function and, consequently, its asymptotic distribution.
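As an illustration of how the block maxima approach and the GEVD in (3.7) fit together in practice, the sketch below forms monthly blocks, extracts the block maxima and fits a GEV model by maximum likelihood with SciPy. It is a minimal example: the array `claims` is a hypothetical placeholder for the daily claim series, not the thesis data. Note that SciPy's `genextreme` parameterises the shape as $c = -\xi$, so the sign must be flipped to read off the EVI used here.

```python
import numpy as np
from scipy.stats import genextreme

# `claims` is assumed to be a 1-D array of daily claim amounts (placeholder data only).
rng = np.random.default_rng(1)
claims = rng.lognormal(mean=5.0, sigma=1.0, size=2400)

block_length = 30                                        # roughly one "month" of daily claims
n_blocks = claims.size // block_length
blocks = claims[: n_blocks * block_length].reshape(n_blocks, block_length)

block_maxima = blocks.max(axis=1)                        # the new data set used by the BMM

# Fit the GEV distribution to the block maxima by maximum likelihood.
c_hat, loc_hat, scale_hat = genextreme.fit(block_maxima)
xi_hat = -c_hat                                          # SciPy uses c = -xi

print(f"location  mu^    = {loc_hat:.2f}")
print(f"scale     sigma^ = {scale_hat:.2f}")
print(f"shape     xi^    = {xi_hat:.3f}  (xi < 0 points to the Weibull domain)")
```

The sign of the fitted shape parameter then indicates the domain of attraction without any prior choice among the Gumbel, Fréchet and Weibull families.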
Sometimes, models from different classes of distributions may each be a plausible fit to a dataset. It then becomes necessary to run more objective tests in order to select the model that best fits the data. For instance, if the shape parameter of a proposed model belonging to the Weibull class of distributions has a confidence interval that includes zero, or has an upper confidence limit that is very close to zero, then the data at hand could also be fitted with another model from the Gumbel class of distributions. Hence, we test for the feasibility of the "Gumbel-class" model in such circumstances. The first point of call is the set of diagnostic plots for the "Gumbel-fit". Here, an "eye-ball" inspection of the linearity in both the probability plot and the quantile plot can be a good indicator of which model is superior given the available data. Thus, we choose the model in which such linearity is better achieved.

We can also perform the following hypothesis test to determine whether the "Gumbel-class" model is appropriate for the data, by setting H0: the "Gumbel-class" model is fit for the data, vs. H1: the "Gumbel-class" model is not fit for the data. We reject H0 if the test statistic is greater than the critical value (or if the p-value is less than the significance level of the test). Here, the test statistic used is called the Gumbel statistic (Sérgio, 2012) and is given by:

$$GS_m = \frac{Y_{m:m} - Y_{\frac{m}{2}+1:m}}{Y_{\frac{m}{2}+1:m} - Y_{1:m}}.$$

Under the validity of H0, the following holds:

$$GS_m^* = \frac{GS_m - b_m}{a_m} \xrightarrow[m \to \infty]{} \Lambda,$$

where $b_m$ and $a_m$ are coefficients of attraction to the Gumbel class of distributions (Sérgio, 2012); we therefore reject H0 if $GS_m^* \le G_\alpha$ at a significance level $\alpha$, where $G_\alpha$ represents the standard Gumbel quantile.

Last but not least, we can also compare the negative log-likelihood, the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC) of the two models and choose the model with the smallest score (Gilleland and Katz, 2016). When comparing models fitted by maximum likelihood to the same data, the negative log-likelihood can be used as a measure of model fit. Assuming $F_\theta(x)$ is the distribution function of the random variable $X$, with density $f(x \mid \theta)$, the likelihood function is given as $L(\theta \mid x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i \mid \theta)$, where the $x_i$ are the observed values and $\theta$ is the parameter of interest, and the log-likelihood as $\ln L(\theta \mid x_1, \ldots, x_n) = \sum_{i=1}^{n} \ln f(x_i \mid \theta)$. The MLE maximises the likelihood $L(\theta)$ over all possible values of $\theta$; the higher the value, the "better". Consequently, a lower negative log-likelihood value is an indication of a better model fit.

As explained by Pinheiro and Grotjahn (2015) and also by Gilleland and Katz (2016), the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) are statistical measures that estimate the relative quality of a model by assessing the amount of information lost by the model, taking into account the model's structure and its goodness-of-fit. Thus, these two statistical measures aim at achieving parsimony in the model by penalising models with more parameters. The smaller the AIC/BIC (i.e. the less information lost), the more preferable the model. The AIC is given as $2k - 2\ln(L)$, whilst the BIC is given as $[\ln(n) - \ln(2\pi)]k - 2\ln(L)$, where $k$ is the number of parameters, $L$ is the maximised value of the likelihood function and $n$ is the number of data points.
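A sketch of this comparison, assuming the hypothetical `block_maxima` array from the previous snippet, is given below: it fits both the full GEV model and the Gumbel model ($\xi = 0$) and reports the negative log-likelihood, AIC and BIC of each. The BIC here follows the expression quoted above; a more commonly used form is $k\ln(n) - 2\ln(L)$.

```python
import numpy as np
from scipy.stats import genextreme, gumbel_r

def scores(neg_loglik, k, n):
    """Return (negative log-likelihood, AIC, BIC) using the formulas quoted in the text."""
    aic = 2 * k + 2 * neg_loglik                          # AIC = 2k - 2 ln L
    bic = (np.log(n) - np.log(2 * np.pi)) * k + 2 * neg_loglik
    return neg_loglik, aic, bic

n = block_maxima.size

# Full GEV fit (3 parameters) vs. Gumbel fit (2 parameters, i.e. xi fixed at 0).
gev_params = genextreme.fit(block_maxima)
gum_params = gumbel_r.fit(block_maxima)

nll_gev = -np.sum(genextreme.logpdf(block_maxima, *gev_params))
nll_gum = -np.sum(gumbel_r.logpdf(block_maxima, *gum_params))

for name, nll, k in [("GEV", nll_gev, 3), ("Gumbel", nll_gum, 2)]:
    print(name, "NLL / AIC / BIC:", [round(v, 2) for v in scores(nll, k, n)])
```

The model with the smaller scores would be preferred, mirroring the comparison of the "Gumbel-class" and "Weibull-class" fits reported later in the analysis.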
3.2.6 Estimation of Parameters

Usually, it is difficult to know the exact numerical values that summarise the properties of an underlying unknown distribution; we can only approximate them by estimating the parameters of the corresponding limiting distribution. In fitting the Generalised Extreme Value Distribution (GEVD) to the sample maxima, we need to estimate the location (µ), scale (σ) and shape (ξ) parameters if we wish to make any meaningful inference about the behaviour of extreme events. In the literature there are several estimation methods for the parameters of the GEVD, including Maximum Likelihood Estimation, L-Moments, Probability Weighted Moments and Bayesian estimation. In this section, we consider the two most popular methods in EVT applications: Maximum Likelihood Estimation (MLE) and Probability Weighted Moments Estimation (PWME).

The method of maximum likelihood is a routine procedure for obtaining estimators of the unknown parameters of the distribution of a data set. It assumes that the data are realisations of independent and identically distributed random variables and defines a likelihood function in terms of the observed data and the parameters. The ML estimate is then obtained by maximising the likelihood function, or the corresponding log-likelihood function (a monotone transformation that changes products into sums), with respect to the unknown parameters of interest. Although for small sample sizes (say, n < 50) the MLE is unstable and can give unrealistic estimates of the shape parameter (Coles and Dixon, 1999), the adaptability of maximum likelihood to changes in model structure, such as missing values and non-stationarity, makes it preferable to other parameter estimation methods (Coles, 2001; Sérgio, 2012).

Now, suppose that x_1, . . . , x_m are independent realisations of sample maxima with a common GEVD. The log-likelihood function in terms of the GEVD parameters when ξ ≠ 0 is

ℓ(µ, σ, ξ | x_1, . . . , x_m) = -m ln σ - (1/ξ + 1) Σ_{i=1}^m ln[1 + ξ(x_i - µ)/σ] - Σ_{i=1}^m [1 + ξ(x_i - µ)/σ]^(-1/ξ),   (3.8)

provided that

1 + ξ(x_i - µ)/σ > 0,   for i = 1, . . . , m,   (3.9)

and when ξ = 0, the log-likelihood is

ℓ(µ, σ | x_1, . . . , x_m) = -m ln σ - Σ_{i=1}^m (x_i - µ)/σ - Σ_{i=1}^m exp[-(x_i - µ)/σ].   (3.10)

The maximum likelihood estimators (µ̂, σ̂, ξ̂) of the unknown parameters (µ, σ, ξ) are found by maximising (3.8) when ξ ≠ 0, or (3.10) when ξ = 0, with respect to the parameters of interest.

As stated by Coles (2001), the standard MLE regularity conditions for asymptotic efficiency and consistency do not automatically apply in EVT. This is because the endpoints of the GEVD are functions of the parameter values: µ - σ/ξ is the upper endpoint when ξ < 0 and the lower endpoint when ξ > 0. This makes the usual regularity conditions associated with the MLE invalid. Smith (1985) studied this problem and stated the following results in terms of the shape parameter ξ:

• the MLE is regular and has the standard asymptotic properties when ξ > -1/2;
• the MLE is unlikely to be attainable when ξ < -1;
• the MLE is attainable, but does not have the standard asymptotic properties, when -1 < ξ < -1/2.
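In practice, the log-likelihood in (3.8) is maximised numerically. The following is a minimal R sketch using optim(); the simulated maxima merely stand in for the monthly claim maxima, and the starting values are ad hoc choices.

    # Numerical MLE for the GEV parameters by minimising the negative of (3.8)/(3.10)
    gev_negloglik <- function(par, x) {
      mu <- par[1]; sigma <- par[2]; xi <- par[3]
      if (sigma <= 0) return(1e10)
      z <- 1 + xi * (x - mu) / sigma
      if (any(z <= 0)) return(1e10)            # condition (3.9) violated
      if (abs(xi) < 1e-8) {
        t <- (x - mu) / sigma
        sum(log(sigma) + t + exp(-t))          # negative of (3.10)
      } else {
        sum(log(sigma) + (1 / xi + 1) * log(z) + z^(-1 / xi))  # negative of (3.8)
      }
    }

    set.seed(1)
    x <- 50 + 20 * (-log(-log(runif(89))))     # illustrative Gumbel-type maxima
    fit <- optim(c(mean(x), sd(x), 0.1), gev_negloglik, x = x, hessian = TRUE)
    fit$par                                    # (mu_hat, sigma_hat, xi_hat)
    sqrt(diag(solve(fit$hessian)))             # approximate standard errors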
The second most popular estimation method employed in EVT is the Probability Weighted Moments Estimator (PWME). Introduced by Landwehr et al. (1979) and Greenwood et al. (1979), the PWME is a generalisation of the Method of Moments Estimator (MME), but one that assigns more weight to the tail observations of the distribution. According to Diebolt et al. (2008), the PWME is commonly employed for hydrological and climatological data because of its easy implementation and its good performance for the distributions encountered in the geosciences. As in the classical Method of Moments, the underlying idea of the PWME is to match the probability weighted moments with their empirical counterparts. The probability weighted moments are defined as

M_{p,r,s} = E[ Y^p (F(Y))^r (1 - F(Y))^s ],   (3.11)

where p, r and s are real numbers and Y is a random variable with distribution function F. In the context of a GEVD for sample maxima, parameter estimation by PWME uses the moments

β_r = M_{1,r,0} = E[ Y (F(Y))^r ],   r = 0, 1, . . .   (3.12)

As demonstrated by Landwehr et al. (1979), suppose we have the non-decreasing order statistics Y_{1:p} <= · · · <= Y_{p:p} associated with the sample maxima (Y_1, Y_2, . . . , Y_p) obtained from p blocks of size q, and assume the sample follows a GEVD. Then the empirical estimator β̂_r of the PWM is

β̂_r = M̂_{1,r,0} = (1/p) Σ_{i=1}^p [ (i-1)(i-2)···(i-r) / ((p-1)(p-2)···(p-r)) ] Y_{i:p}
    = (1/p) Σ_{i=1}^p [ Π_{k=1}^r (i-k)/(p-k) ] Y_{i:p}.   (3.13)

Hosking et al. (1985) estimated the parameters of the GEVD for such a sample using the PWME and showed that, when ξ ≠ 0,

β_r = M_{1,r,0} = E[ Y (F(Y))^r ] = (1/(r+1)) [ µ - (σ/ξ)(1 - (r+1)^ξ Γ(1-ξ)) ],   for ξ < 1, ξ ≠ 0, r = 0, 1, . . .   (3.14)

Hence, for r = 0, 1, 2, the PWM estimators (µ̂, σ̂, ξ̂) of the unknown parameters (µ, σ, ξ) are obtained by solving the system of equations

β_0 = M_{1,0,0} = µ - (σ/ξ)(1 - Γ(1-ξ)),
2β_1 - β_0 = 2M_{1,1,0} - M_{1,0,0} = (σ/ξ) Γ(1-ξ)(2^ξ - 1),
(3β_2 - β_0)/(2β_1 - β_0) = (3M_{1,2,0} - M_{1,0,0})/(2M_{1,1,0} - M_{1,0,0}) = (3^ξ - 1)/(2^ξ - 1),   (3.15)

so that

µ̂ = M̂_{1,0,0} + (σ̂/ξ̂)(1 - Γ(1-ξ̂)),
σ̂ = ξ̂(2M̂_{1,1,0} - M̂_{1,0,0}) / [Γ(1-ξ̂)(2^ξ̂ - 1)],
(3M̂_{1,2,0} - M̂_{1,0,0})/(2M̂_{1,1,0} - M̂_{1,0,0}) = (3^ξ̂ - 1)/(2^ξ̂ - 1),   (3.16)

where the moments β_r are replaced by their empirical counterparts in (3.13). It is worth noting that ξ̂ is obtained by solving the last equation of (3.16) numerically, after which σ̂ and µ̂ follow from the first two equations.

For ξ = 0 (the Gumbel case), the PWM estimators (µ̂, σ̂) are given by the solution of the system

M̂_{1,0,0} = µ + σγ,
2M̂_{1,1,0} - M̂_{1,0,0} = σ ln 2,   (3.17)

where γ denotes Euler's constant and M̂_{1,r,0} is as defined in (3.13), so that

µ̂ = M̂_{1,0,0} - σ̂γ,
σ̂ = (2M̂_{1,1,0} - M̂_{1,0,0}) / ln 2.   (3.18)

Hosking et al. (1985) indicated that, for small sample sizes, the PWME outperforms the MLE in terms of variance. They noted that this advantage has made the PWME attractive in the analysis of extreme values, where only sparse data (few observations) are usually available for the extrapolation of rare events. However, they also noted that, unlike the MLE, the PWME has difficulty accommodating more complex model structures.
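A minimal R sketch of these PWM estimators is given below, following (3.13) and (3.16); the simulated maxima are illustrative stand-ins for the data, and the root of the last equation of (3.16) is assumed to lie in (-0.99, 0.99).

    # PWM estimation of the GEV parameters, following (3.13)-(3.16)
    beta_hat <- function(y, r) {
      y <- sort(y); p <- length(y)
      w <- if (r == 0) rep(1, p) else
        sapply(seq_len(p), function(i) prod((i - 1:r) / (p - 1:r)))
      mean(w * y)                              # empirical beta_r in (3.13)
    }

    gev_pwm <- function(y) {
      b0 <- beta_hat(y, 0); b1 <- beta_hat(y, 1); b2 <- beta_hat(y, 2)
      # Solve the last equation of (3.16) for xi numerically
      g  <- function(xi) (3^xi - 1) / (2^xi - 1) - (3 * b2 - b0) / (2 * b1 - b0)
      xi <- uniroot(g, c(-0.99, 0.99))$root
      sigma <- xi * (2 * b1 - b0) / (gamma(1 - xi) * (2^xi - 1))
      mu <- b0 + sigma / xi * (1 - gamma(1 - xi))
      c(mu = mu, sigma = sigma, xi = xi)
    }

    set.seed(2)
    y <- 50 + 20 * (-log(-log(runif(89))))     # illustrative maxima
    gev_pwm(y)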
3.2.7 Estimation of Extreme Quantiles

Aside from the location, scale and shape parameters that are estimated when fitting the GEVD, inference on the behaviour of extreme occurrences also requires estimates of other, equally important quantities such as return levels with their corresponding return periods and exceedance probabilities. These are extreme quantiles that form part of the characterisation of the behaviour of uncommon events. Stated differently, the expected time interval (the return period) within which extreme events are likely to occur or recur, and the magnitude (the return level) with which they are expected to occur at a given probability (the exceedance probability), are of key importance in the study of extreme events.

Assume an i.i.d. random variable Y with distribution function F(y) = P(Y <= y). For a probability 0 < α < 1, the αth quantile of Y is given by the inverse cumulative distribution function

Q(α) = F^{-1}(α) = inf{ y : F(y) >= α }.   (3.19)

Now, let y_{1-α} be the extreme quantile of order (1 - α) of the GEVD for the maxima Y. For a sufficiently small probability α, we obtain estimates of extreme quantiles by inverting the GEVD function in (3.7). Thus,

y_{1-α} = G_ξ^{-1}(1-α) = µ - (σ/ξ)[1 - (-ln(1-α))^{-ξ}],   if ξ ≠ 0,
y_{1-α} = µ - σ ln(-ln(1-α)),                               if ξ = 0,   (3.20)

where the parameters (µ, σ, ξ) are replaced by their estimators (µ̂, σ̂, ξ̂). Setting y_α = -ln(1-α), we can rewrite (3.20) in terms of the GEV parameters as

y_{1-α} = G_ξ^{-1}(1-α) = µ - (σ/ξ)[1 - y_α^{-ξ}],   if ξ ≠ 0,
y_{1-α} = µ - σ ln(y_α),                              if ξ = 0.   (3.21)

Recall that for ξ < 0 the limiting GEVD has a finite upper endpoint x_F, which can be estimated as

x_F = µ - σ/ξ,   (3.22)

where µ, σ and ξ are replaced by their respective ML or PWM estimators.

Following the BMM, suppose M_p = max(Y_1, . . . , Y_p) denotes the maximum of a block of i.i.d. random variables with unknown distribution F, and assume the p sample maxima (M_p1, . . . , M_pp) follow a GEVD G_{µ,σ,ξ}. Then, from equation (3.3), we can write

F_{M_p} = F^p ≈ G_{µ,σ,ξ},   (3.23)

so that the (1 - α) quantile of the original data Y, denoted y_{1-α}, can be written using the relations in (3.20) as F(y_{1-α})^p = (1 - α)^p ≈ G_{µ,σ,ξ}(y_{1-α}), giving

y_{1-α} = G_{µ,σ,ξ}^{-1}((1-α)^p) = µ - (σ/ξ)[1 - (-ln(1-α)^p)^{-ξ}],   if ξ ≠ 0,
y_{1-α} = µ - σ ln(-ln(1-α)^p),                                          if ξ = 0.   (3.24)

Again, the parameters (µ, σ, ξ) are replaced by their ML or PWM estimators (µ̂, σ̂, ξ̂).

Now, given that the distribution of sample maxima observed over non-overlapping and equal time periods follows a GEVD G, the return level R_n^k is defined through G(R_n^k) = 1 - 1/k, where R_n^k is the level expected to be exceeded with probability 1/k, on average once every k intervals (return periods) of length n. Hence, from (3.20), the return level estimate is

R̂_k = G^{-1}(1 - 1/k) = µ̂ - (σ̂/ξ̂)[1 - (-ln(1 - 1/k))^{-ξ̂}],   if ξ ≠ 0,
R̂_k = µ̂ - σ̂ ln(-ln(1 - 1/k)),                                   if ξ = 0.   (3.25)

On a logarithmic scale, a return level plot (a plot of the return level estimates against ln(y_α)) should be linear when ξ = 0, convex with a finite bound when ξ < 0, and concave with no finite bound when ξ > 0. Coles (2001) cautions that inferences on return levels corresponding to long return periods should be interpreted with care, because the precision measures of an estimator rest on the assumption that the model is correct.
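A minimal R sketch of the return level calculation in (3.25) is shown below; the parameter values are illustrative and are not the estimates obtained for the claims data.

    # Return level expected to be exceeded on average once every k blocks, per (3.25)
    gev_return_level <- function(k, mu, sigma, xi) {
      yk <- -log(1 - 1 / k)
      if (abs(xi) < 1e-8) mu - sigma * log(yk) else mu - sigma / xi * (1 - yk^(-xi))
    }

    sapply(c(10, 50, 100), gev_return_level, mu = 50, sigma = 20, xi = -0.2)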
3.3 Peaks-Over-Threshold Method (POTM)

The Peaks-Over-Threshold Method (POTM) is the second main approach used for analysing the behaviour of rare events in EVT. It considers all available data and focuses on the observations that exceed a given fixed high level (threshold). It then fits an appropriate parametric model, with an asymptotic distribution (the Generalised Pareto Distribution, GPD), to the excesses (exceedances) over that fixed high threshold, provided the requisite assumptions are satisfied.

Suppose X_1, X_2, . . . , X_n is a sequence of independent and identically distributed random variables with unknown distribution function F(x) = P(X <= x). Given a designated high threshold u, we regard those X beyond u as exceedances of u; that is, an exceedance of the threshold u occurs whenever X > u. In this sense, we define the excess value y over the threshold u as y = x - u, where 0 < y < x_F - u and x_F is the upper endpoint of the distribution function F. We then define the probability distribution (also called the conditional excess distribution function) of Y as

F_u(y) = P(X - u <= y | X > u) = P(X <= y + u, X > u) / P(X > u) = [F(y + u) - F(u)] / [1 - F(u)]
       = [F(x) - F(u)] / [1 - F(u)],   since x = y + u,   (3.26)

so that

F(x) = F_u(y)[1 - F(u)] + F(u).   (3.27)

In practice it is not possible to know the exact distribution F, and hence the corresponding distribution of the excesses, so we use asymptotic approximations. As cited in Coles (2001), Balkema and de Haan (1974) and Pickands (1975) showed that, for a sufficiently large threshold u, the distribution of the excesses can be approximated by the Generalised Pareto Distribution (GPD). Specifically, if F is the distribution function of a random sample (X_1, . . . , X_n) with upper endpoint x_F = sup{x : F(x) < 1}, and F_u is the distribution of the excesses y = x - u over a threshold u, where y ∈ [0, x_F - u), then

lim_{u → x_F}  sup_{0 <= y < x_F - u} | F_u(y) - H_{ξ,σ_u}(y) | = 0,   (3.28)

where σ_u > 0 and ξ are the scale and shape parameters of the GPD function H. In other words, the GPD family is the limiting distribution of the normalised excesses over an increasingly large threshold.

3.3.1 Selection of Threshold

The POTM works by analysing observations that exceed a predetermined high threshold. Similar to the choice of block length in the BMM, the choice of threshold level in the POTM remains a subjective matter, implying a compromise between bias and variance. Stated clearly, fixing a low threshold increases the number of exceedances available for model estimation and leads to a more precise model (a decrease in the model's variability); nonetheless, it introduces less extreme observations from the centre of the distribution, which increases the estimation bias. On the other hand, a high threshold yields only a few exceedances and renders the estimator highly sensitive to the sample size (a high-variance estimator), although it ensures that the estimated value is closer to the true parameter value. As a result of this trade-off, a careful combination of techniques is used to determine a suitable threshold that provides a reasonable approximation to the limiting model.
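Before turning to the graphical tools used for this choice, the basic step of forming excesses over a candidate threshold can be sketched in R as follows; the simulated claims and the 95th-percentile threshold are purely illustrative choices.

    # Forming threshold exceedances and excesses for the POT approach
    set.seed(3)
    claims <- rexp(1000, rate = 1 / 200)      # stand-in for daily claim amounts
    u <- quantile(claims, 0.95)               # a candidate high threshold
    exceedances <- claims[claims > u]         # observations with X > u
    excesses <- exceedances - u               # y = x - u, to be modelled with a GPD
    length(excesses); mean(excesses)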
Two graphical methods were proposed by Coles (2001) in this regard. The first is based on fitting the model at a range of thresholds and selecting the threshold level at which the parameter estimates appear stable. The second is based on the mean excess function (or mean excess plot), a tool generally used both to help select the threshold level and to assess the suitability of the GPD model; this tool is quite common in the field of insurance.

Considering the latter, suppose X is a random variable with distribution F and u is a fixed high threshold. The distribution of the excesses of X over u, i.e. Y = X - u, is

F_u(y) = P(X - u <= y | X > u) = [F(y + u) - F(u)] / [1 - F(u)],   for 0 <= y <= x_F - u.

Also, if E[X] is finite, then the Mean Excess Function (MEF), the mean with respect to F_u, is defined as

e(u) = E(X - u | X > u) = ∫_0^{x_F - u} F̄_u(y) dy = [ ∫_u^{x_F} F̄(x) dx ] / F̄(u).   (3.29)

Additionally, if X follows a GPD with parameters σ and ξ, with mean E(X) = σ/(1 - ξ), then the distribution of the excesses of X over a fixed threshold u is again a GPD, with parameters σ + ξu and ξ, and the mean excess function is

e(u) = (σ + ξu) / (1 - ξ),   for all u such that σ + ξu > 0.   (3.30)

Now, assume the distribution of the excesses Y over a threshold u is F_u(y) ≈ H_{σ_u,ξ}, where H is a GPD with parameters ξ and σ_u > 0. Then, for any higher threshold v >= u, the distribution of the excesses over v remains a GPD with the same shape parameter but a scale parameter that grows linearly in v:

F_v(y) ≈ H_{σ_u + ξ(v-u), ξ}.

Therefore, for ξ < 1, the MEF at the higher threshold v >= u is

e(v) = E[X - v | X > v] = [σ_u + ξ(v - u)] / (1 - ξ) = ξv/(1 - ξ) + (σ_u - ξu)/(1 - ξ),   v ∈ [u, x_F],   (3.31)

where x_F = u - σ_u/ξ if ξ < 0 and x_F = ∞ if 0 <= ξ < 1. Both (3.30) and (3.31) show that the MEF (also known as the Mean Residual Life Function) is a linear function of the threshold. The diagnostic tool for examining this linearity is the sample mean excess plot. Given the data X_1, . . . , X_n, the MEF is estimated empirically by the sample mean excess function, i.e. the sum of the excesses over the threshold u divided by the number of exceedances:

e_n(u) = [ Σ_{i=1}^n (X_i - u) I(X_i > u) ] / [ Σ_{i=1}^n I(X_i > u) ],

where I is an indicator function equal to 1 if X_i > u and 0 otherwise. The Sample Mean Excess Plot (SMEP) is then a plot of {(X_{i,n}, e_n[X_{i,n}]) : 1 <= i <= n - 1}, where X_{i,n} denotes the ith order statistic of the sample of size n.

Coles (2001) stated that the sample mean excess estimates should change linearly with the threshold above the level at which the GPD model is appropriate. Concordantly, Davison and Smith (1990) used the property that linearity of the mean excess function for increasing thresholds indicates that the GPD model is valid for the underlying data. Gilleland and Katz (2016) stated that, when using the mean excess plot to aid threshold selection, the focal point is to locate the smallest possible threshold from which a straight line can be drawn through the higher thresholds within the given uncertainty bounds.
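A minimal R sketch of the sample mean excess function over a grid of candidate thresholds is given below; the simulated data are illustrative, and ready-made versions of this plot exist in several R extreme value packages.

    # Sample mean excess function e_n(u) over a grid of candidate thresholds
    set.seed(4)
    x <- rexp(1000, rate = 1 / 200)           # illustrative claim-like data
    mean_excess <- function(u, x) {
      exc <- x[x > u]
      if (length(exc) == 0) return(NA_real_)
      mean(exc - u)                           # sum of excesses / number of exceedances
    }
    u_grid <- quantile(x, seq(0.50, 0.98, by = 0.02))
    e_u <- sapply(u_grid, mean_excess, x = x)
    plot(u_grid, e_u, type = "b",
         xlab = "threshold u", ylab = "sample mean excess e_n(u)")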
When the graph of the SMEP is a fairly straight line with positive slope, a Pareto Type I distribution (a GPD model with a positive shape parameter) is taken to be a suitable fit for the underlying data. On the other hand, a fairly straight line with negative slope indicates that a Pareto Type II distribution (a GPD model with a shape parameter less than 0) will fit the underlying data best, while a fairly horizontal line means that the data are exponentially distributed, so the appropriate GPD model has a shape parameter equal to 0. Also, if a GPD model at a threshold level u fits the data, then for higher thresholds v >= u the same GPD model applies, but with a linear increase in the mean excess function e_n(v). Coles (2001) and Sérgio (2012) independently indicated that the interpretation of the mean excess plot (or mean residual life plot) is not always straightforward, especially when there are too few exceedances to make meaningful inferences. As a result, the threshold should be fixed at the lowest point to the right of which the plot shows a roughly linear pattern.

The other graphical technique that aids threshold selection, besides the mean excess plot, is the Parameter Stability Plot (PSP). This technique involves estimating the GPD parameters at a range of thresholds (or order statistics) and choosing the threshold at which the estimated parameters become fairly stable (Coles, 2001; Beirlant et al., 2006). Coles (2001) indicated that, above a threshold u for which the GPD is a valid model, estimates of the shape parameter ξ should be approximately constant, while estimates of the scale parameter σ_u should be linear in u. In the sections that follow, we look at model fitting and the estimation of the key parameters of interest.

3.3.2 The Generalised Pareto Distribution (GPD)

Suppose the probability P((M_n - b_n)/a_n <= x) of the normalised maximum M_n converges in distribution to a non-degenerate GEVD G_ξ as n → ∞, where a_n > 0 and b_n are normalising constants. Then, for a sufficiently high threshold u, the distribution function of the excesses (X - u) above u can be approximated by the Generalised Pareto class of distributions (Coles, 2001; Sérgio, 2012), given by

H(x; u, σ_u, ξ) = 1 - [1 + ξ(x - u)/σ_u]^(-1/ξ),   if ξ ≠ 0, for 1 + ξ(x - u)/σ_u >= 0,
H(x; u, σ_u, ξ) = 1 - exp[-(x - u)/σ_u],           if ξ = 0, for x >= u,   (3.32)

where ξ and σ_u are the shape and scale parameters respectively. Furthermore, for ξ < 0, ξ > 0 and ξ = 0, the GPD reduces to the Weibull class of distributions (with upper endpoint u - σ_u/ξ), the Pareto (or Fréchet) class of distributions (with infinite upper endpoint), and the Gumbel class of distributions respectively. As with the GEVD, the shape parameter ξ is dominant in determining the characteristics of the underlying GPD.
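The GPD form in (3.32) can be sketched in R as follows; the function name and the example parameter values are illustrative only.

    # Minimal sketch of the GPD distribution function in (3.32)
    pgpd_manual <- function(x, u = 0, sigma_u = 1, xi = 0) {
      y <- pmax(x - u, 0)
      if (abs(xi) < 1e-8) {
        1 - exp(-y / sigma_u)
      } else {
        z <- pmax(1 + xi * y / sigma_u, 0)
        # For xi < 0, values above the endpoint u - sigma_u/xi receive probability 1
        1 - z^(-1 / xi)
      }
    }

    pgpd_manual(c(250, 500, 1000), u = 200, sigma_u = 150, xi = 0.1)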
3.3.3 Estimation of Parameters

In the literature, there are several methods for estimating the GPD parameters. However, we will consider only Maximum Likelihood Estimation (MLE) and Probability Weighted Moments Estimation (PWME), owing to their popularity in EVT. As under the GEVD, using the MLE to approximate the GPD parameters requires forming the likelihood function, or equivalently the log-likelihood function, and maximising it with respect to the parameters of interest.

Suppose the i.i.d. random variables X_1, . . . , X_n ~ F, where F ∈ MDA(G_ξ) for some ξ ∈ ℜ, and suppose X_1, . . . , X_p are p exceedances of a threshold u, so that Y_i = X_i - u, i = 1, . . . , p, are the corresponding excesses. Then, with the excesses y_1, . . . , y_p independent and approximately distributed as a GPD H(x; u, σ_u, ξ), the log-likelihood function is

ℓ(σ_u, ξ | y_1, . . . , y_p) = -p ln σ_u - (1 + 1/ξ) Σ_{i=1}^p ln(1 + ξ y_i / σ_u),   if ξ ≠ 0,
ℓ(σ_u | y_1, . . . , y_p)    = -p ln σ_u - (1/σ_u) Σ_{i=1}^p y_i,                      if ξ = 0.   (3.33)

To simplify the arithmetic, Coles (2001) records a reparameterisation due to Davison which expresses ξ̂ explicitly as a function of ζ̂ by defining ζ = ξ/σ_u, so that the log-likelihood can be rewritten as

ℓ(ζ, ξ | y_1, . . . , y_p) = -p ln ξ + p ln ζ - (1 + 1/ξ) Σ_{i=1}^p ln(1 + ζ y_i),   if ξ ≠ 0.   (3.34)

The maximum likelihood estimators (σ̂_u, ξ̂), or equivalently (ζ̂, ξ̂), are then obtained by maximising (3.33) or (3.34) respectively with respect to the parameters of interest. For ξ = 0, we have the case of a classical exponential distribution.

We now turn to the Probability Weighted Moments Estimation (PWME). Recall that for a random variable X, the PWM is defined as M_{p,r,s} = E[ X^p (F(X))^r (1 - F(X))^s ], where p, r, s ∈ ℜ. Hosking et al. (1985) proposed estimating the GPD parameters by taking M_{p,r,s} with p = 1, r = 0 and s = 0, 1, . . ., so that the moments are

M_{1,0,s} = σ_u / [(s + 1)(s + 1 - ξ)],   for ξ < 1.   (3.35)

The empirical estimate of the moment M_{1,0,s} is

M̂_{1,0,s} = (1/p) Σ_{i=1}^p [ Π_{j=1}^s (p - i - j + 1)/(p - j) ] X_{i:p},   (3.36)

where X_{i:p} is the ith order statistic of the p exceedances over the fixed threshold. Replacing M_{1,0,s} by its estimator M̂_{1,0,s} and solving the resulting equations for s = 0 and s = 1 yields the PWM estimators of the shape and scale parameters:

ξ̂ = 2 - M̂_{1,0,0} / (M̂_{1,0,0} - 2 M̂_{1,0,1}),
σ̂_u = 2 M̂_{1,0,0} M̂_{1,0,1} / (M̂_{1,0,0} - 2 M̂_{1,0,1}).   (3.37)

Hosking et al. (1985) established that, for small sample sizes, the PWME has smaller variance and hence performs better than the MLE. However, Sérgio (2012) indicated that for ξ >= 1 the PWME does not exist and that, in some cases, observations may fall beyond the estimated right endpoint x_F.
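A minimal R sketch of both estimation routes for the GPD is given below: a numerical MLE based on (3.33) and the closed-form PWM estimators in (3.37). The simulated excesses are illustrative placeholders.

    # GPD parameter estimation for threshold excesses: MLE on (3.33) and PWM per (3.37)
    gpd_negloglik <- function(par, y) {
      sigma <- par[1]; xi <- par[2]; p <- length(y)
      if (sigma <= 0) return(1e10)
      z <- 1 + xi * y / sigma
      if (any(z <= 0)) return(1e10)
      if (abs(xi) < 1e-8) p * log(sigma) + sum(y) / sigma
      else p * log(sigma) + (1 + 1 / xi) * sum(log(z))
    }

    gpd_pwm <- function(y) {
      y <- sort(y); p <- length(y); i <- seq_len(p)
      m0 <- mean(y)
      m1 <- mean(((p - i) / (p - 1)) * y)     # empirical M_{1,0,1} as in (3.36)
      c(sigma = 2 * m0 * m1 / (m0 - 2 * m1),
        xi    = 2 - m0 / (m0 - 2 * m1))       # estimators in (3.37)
    }

    set.seed(5)
    y <- rexp(300, rate = 1 / 150)            # illustrative excesses over a threshold
    optim(c(mean(y), 0.1), gpd_negloglik, y = y)$par
    gpd_pwm(y)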
3.3.4 Estimation of Extreme Quantiles

As with the BMM, useful indicators such as quantile estimates for extreme value analysis can be obtained under the POTM by inverting the GPD function. That is, by inverting the GPD function in (3.32) and substituting the ML or PWM estimates for ξ and σ_u, the (1 - α) quantile of the excesses over a fixed threshold is estimated as

y_{1-α} = H_{σ_u,ξ}^{-1}(1-α) = (σ_u/ξ)(α^{-ξ} - 1),   if ξ ≠ 0,
y_{1-α} = -σ_u ln α,                                     if ξ = 0,   (3.38)

where 0 < α < 1. Also, if ξ < 0, the associated GPD model has a finite upper endpoint whose estimate is

x̂_F = u - σ̂_u / ξ̂.   (3.39)

The quantile (or tail) estimators can also be expressed in terms of the original random variable X with unknown distribution F, using the relations in (3.26) and (3.27). Given that F(x) = F_u(y)[1 - F(u)] + F(u), and that F_u(y) is asymptotically the GPD H_{σ_u,ξ}(y) for sufficiently large u, where y = x - u for x > u, we have F(x) = H_{σ_u,ξ}(x - u)[1 - F(u)] + F(u). Replacing F(u) with its empirical estimator (n - e_u)/n (Smith, 1985), where n is the sample size and e_u is the number of exceedances of the threshold u, we obtain the following tail estimator:

F̂(x) = H_{σ_u,ξ}(x - u)[1 - (n - e_u)/n] + (n - e_u)/n
      = {1 - [1 + ξ(x - u)/σ_u]^(-1/ξ)}(e_u/n) + 1 - e_u/n
      = 1 - (e_u/n)[1 + ξ(x - u)/σ_u]^(-1/ξ),   (3.40)

where ξ and σ_u are substituted with their respective ML or PWM estimates. Inverting the last line of (3.40) and solving for a given probability α yields the (1 - α) extreme quantile estimator

F̂^{-1}(α) = u + (σ_u/ξ)[ (n(1 - α)/e_u)^{-ξ} - 1 ],   if ξ ≠ 0,
F̂^{-1}(α) = u + σ_u ln( e_u / (n(1 - α)) ),            if ξ = 0,   (3.41)

where ξ and σ_u are again substituted with their respective MLE or PWME. Thereafter, measures such as the Value-at-Risk and the Expected Shortfall can be computed.

3.4 Value-at-Risk (VaR) and Expected Shortfall (ES)

One of the most frequent concerns of risk assessment in finance and insurance is the value of extreme quantiles such as the Value-at-Risk (VaR) and the Expected Shortfall (ES). Most regulators in today's financial and insurance systems have come to accept these measures as the benchmark tools for market risk measurement and as the basis of capital requirements for risk exposure. Basically, the VaR is a high quantile (usually the 99th or 99.9th percentile) of the distribution of returns that offers an estimated boundary which should not be exceeded within a specified period, at a given level of confidence (Uppal, 2013; Allen et al., 2011). Though the BMM can be employed in its estimation, the POTM, through the GPD, provides a more suitable means of estimating these high quantiles.

In terms of the GPD parameters, we define VaR_α as the αth extreme quantile of the distribution, estimated as

VaR̂_α = u + (σ̂_u/ξ̂)[ (n(1 - α)/e_u)^{-ξ̂} - 1 ],   (3.42)

where 0 < α < 1, x >= u, σ_u > 0, ξ ∈ ℜ and F̂(u) = (n - e_u)/n. Though the VaR has the advantage of easy interpretation and allows a direct comparison of risk across a diverse portfolio, it ignores the potential size of the return in the event that the calculated VaR is exceeded at the given probability (Szubzda and Chlebus, 2020). This brings to light the Expected Shortfall. The Expected Shortfall (also called the Conditional Value-at-Risk, CVaR) is a measure that takes into consideration the potential sizes of the returns that exceed a given VaR; stated differently, it is the average value of the returns above the VaR. Thus,

ES_α = [1/(1 - α)] ∫_α^1 VaR_s ds.   (3.43)

According to Nortey et al. (2015), the Expected Shortfall can also be written as

ÊS_α = VaR̂_α + E[X - VaR̂_α | X > VaR̂_α].   (3.44)

But the expression E[X - VaR̂_α | X > VaR̂_α] is simply the expected value of the excesses over the threshold level VaR_α, which is the conditional mean excess function. Then, by (3.30), we can write this expression as

E[X - VaR̂_α | X > VaR̂_α] = e(VaR̂_α) = [σ̂_u + ξ̂(VaR̂_α - u)] / (1 - ξ̂).   (3.45)

Substituting (3.45) into (3.44), we have

ÊS_α = VaR̂_α + [σ̂_u + ξ̂(VaR̂_α - u)] / (1 - ξ̂) = VaR̂_α/(1 - ξ̂) + (σ̂_u - ξ̂u)/(1 - ξ̂).   (3.46)
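A minimal R sketch of the VaR and ES estimators in (3.42) and (3.46) follows; the threshold, sample size, number of exceedances and GPD estimates used in the call are illustrative placeholders rather than the results obtained for the claims data.

    # VaR and ES from a fitted GPD, per (3.42) and (3.46)
    gpd_var_es <- function(alpha, u, sigma_u, xi, n, e_u) {
      var_a <- u + sigma_u / xi * ((n * (1 - alpha) / e_u)^(-xi) - 1)   # (3.42)
      es_a  <- var_a / (1 - xi) + (sigma_u - xi * u) / (1 - xi)         # (3.46)
      c(VaR = var_a, ES = es_a)
    }

    sapply(c(0.99, 0.995, 0.999), gpd_var_es,
           u = 500, sigma_u = 150, xi = -0.1, n = 32963, e_u = 1500)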
3.5 Block Maxima Method (BMM) vs. Peaks-Over-Threshold Method (POTM)

Modelling normalised sample maxima with only the BMM is a wasteful approach to analysing extreme value behaviour when all observations are available in the data. Thus, it is statistically more efficient to set aside the BMM and instead use the Peaks-Over-Threshold Method (POTM) whenever an entire series of observations is accessible to the practitioner. Coles (2001) explained the wastefulness as follows: given a model that describes variation in maxima from one period to another, realisations that are lower in one block but more extreme than the maxima of other blocks are excluded from the analysis. He added that, rather than artificially blocking the data (say annually) and extracting the annual maximum from each block, it is more efficient to define a realisation as extreme if it falls above some predetermined high level. Consequently, efficiency is improved because all information in the observed values, in terms of exceeding a high fixed threshold, is used in model fitting.

Nonetheless, Diebolt et al. (2008) stated that the BMM may still be preferred for the following reasons. Foremost, it is computationally easier to focus only on block maxima, especially when the practitioner is faced with a very long time series to analyse, as with hourly climatological data. Secondly, working with blocks of a given size, say one year, makes the estimated parameters more interpretable to the environmentalist because of the natural (seasonal) blocking of environmental data. Lastly, the block maxima may be the only measurements available to the practitioner when data are incomplete or missing; stated differently, because the approach considers only the maximum of each block, there is little or no need for the remaining observations below the maxima. For these reasons, among others, modelling block maxima with a GEVD remains a very common procedure in hydrology and climatology.

Chapter 4

DATA ANALYSIS AND FINDINGS

The data were obtained from the records of the private health insurance unit of the 37 Military Hospital and comprise daily claims that are submitted monthly to its partner private health insurance companies, currently 13 in number. The data come from a secondary source and are made up of 32,964 claims spanning a period of 8 years (2012 - 2019). The source of the data also recorded "no submission" in some months, and "multiple submissions" (claims from preceding and current months) in other months. As a result, there are 90 (instead of 96) monthly recorded submissions for the 8-year period. The variables contained in the data include "Diagnosis", "Amount of Medical Services", "Amount of Drugs" and "Total Amount". However, the prime variable of interest is the "Amount of Medical Services", i.e. all healthcare services other than drugs provided to a private health insurance patient. The choice of this variable was influenced by the substantial missing data on the "Amount of Drugs" for the period 2018 - 2019; consequently, the "Total Amount" for the entire period would be understated and is therefore unattractive for analysis. Furthermore, records from the hospital reveal that most new companies cover only medical services in the early stage of their partnership and later renew their contracts to cover drugs.
This is in line with the overall aim of the study: to provide insights into the likely occurrence of large claim amounts, particularly for prospective and new health insurance firms interested in partnering with the 37 Military Hospital. Therefore, for the purpose of this study, the word "claims" refers specifically to the "Amount of Medical Services".

In this chapter, we run preliminary statistical analyses to describe the data and then employ the parametric EVT approach to describe the behaviour of large claim amounts. Based on the proposed model and its parameter estimates, we forecast the likelihood of surpassing the maximum claim amount in the data and also project the probability of exceeding certain large claim amounts within a particular period of time. The R software was used for all statistical computations, with emphasis on packages such as evd, extRemes, extremefit, fExtremes, ismev and POT for the core analysis.

4.1 Preliminary Analysis

Table 4.1.1: Summary Statistics

Claims   Min. Amt   Max. Amt   Std. Err   Mean    Skewness   Kurtosis
32,963   5          15,740     3.06       215.5   7.79       100.7

Table 4.1.1 reveals that, on a particular day, the hospital can claim a minimum amount of 5 cedis and a maximum amount of 15,740 cedis for services rendered to patients on a private health insurance basis; these amounts correspond to a minor wound dressing and a tumour surgery respectively. Furthermore, an insurance company can expect an average claim amount of about 216 cedis from the hospital on any particular day. Also, the two "shape" statistics, skewness and kurtosis, give signals of extreme deviations from the centre of the distribution, an indication that the data have a long right-tailed distribution. However, these two measures alone are not sufficient to determine whether the underlying distribution of the data is heavy-tailed or light-tailed. Figure 4.1.1 gives a pictorial description of the data at hand; the results were obtained using the R software (see Appendix B2).

Figure 4.1.1: Scatter Plot and Histogram of Claims Submitted for the Period 2012 - 2019.

The scatter plot in Figure 4.1.1 exhibits no clear pattern or trend and hence gives an indication of no association between the variables. The plot also shows that many of the claim amounts lie above the average claim size of 216 cedis, which is consistent with the long right-tailed distribution depicted by the histogram.

In order to make fair projections from any time series data, we require the data to be stationary. Here, we employ the Augmented Dickey-Fuller (ADF) test, with H0: the data are non-stationary, against H1: the data are stationary, at a significance level of 0.05 (Nkrumah, 2017). The results were obtained using the R software (see Appendix B3).

Table 4.1.2: ADF Stationarity Test for Claim Amounts

Variable   Test Statistic   P-value
Claims     -22              0.01

From the test, the p-value (0.01) is less than the significance level (0.05). Therefore, we reject the null hypothesis and conclude that the data are stationary.
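For illustration, the kind of preliminary check reported above can be sketched in R as follows, using a simulated placeholder vector in place of the actual claims series; the Augmented Dickey-Fuller test assumes the tseries package is available.

    # Preliminary summary statistics and ADF stationarity test (illustrative data)
    library(tseries)

    set.seed(6)
    claims <- rexp(5000, rate = 1 / 215)     # stand-in for the daily claim amounts

    stats <- c(n        = length(claims),
               min      = min(claims),
               max      = max(claims),
               mean     = mean(claims),
               skewness = mean((claims - mean(claims))^3) / sd(claims)^3,
               kurtosis = mean((claims - mean(claims))^4) / sd(claims)^4)
    round(stats, 2)

    adf.test(claims)                         # H0: the series is non-stationary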
Another preliminary analysis needed to justify the use of an extreme value distribution in this work is the Exponential QQ-plot. Using the R software (see Appendix B4), we obtain the Exponential QQ-plot of the claims.

Figure 4.1.2: Exponential QQ-Plot of Claims

The Exponential QQ-plot in Figure 4.1.2 shows a roughly linear pattern for the lower and middle values of the sample, but a convex curvature towards the upper values. This indicates that the underlying limiting distribution of the data is lighter-tailed than would be expected from an Exponential distribution (Sérgio, 2012). Hence, we consider the plausibility of the extreme value distributions as limiting distributions for the data. In this regard, we employ the GEVD and the GPD fits for the Block Maxima Method and the Peaks-Over-Threshold Method of EVT respectively.

4.2 The Block Maxima Method (BMM)

In employing the Block Maxima approach, we split the data into monthly blocks and obtain the maximum of each block. The choice of block length is based on the fact that the hospital submits its claims monthly and, more importantly, on the larger sample size obtained (89 observations) compared with blocking the data yearly (8 observations). Using the MLE and the PWME we estimate the parameters of the appropriate limiting distribution and then use them to estimate important extreme quantiles such as return levels, Value-at-Risk (VaR) and Expected Shortfall (ES), with their corresponding exceedance probabilities.

In pursuance of the parametric approach of EVT under the BMM, we fit the GEVD as a limiting distribution to the monthly claims maxima and obtain both the ML and PWM estimates of the GEVD parameters: location (µ), scale (σ) and shape (ξ). The R software (see Appendix A1 and B5) was used to obtain the results. As captured in Appendix A1, the standard errors of the estimated GEVD parameters on the original scale are unreasonably large, an indication of a less precise model. Taking a square-root transformation of the underlying variable results in a more precise model. It is worth noting that, although the log-transformation of the underlying variable gives much smaller standard errors for the parameter estimates (see Appendix A2), its curvilinear nature inhibits accurate results for the data at hand; this can be seen in the diagnostic plots captured in Appendix A3, where both the probability plot and the quantile plot under the log-transformation are less linear than those under the square-root transformation. The figure below gives the diagnostic plots for the GEV distribution fitted to the sample maxima after the square-root transformation.

Figure 4.2.1: Diagnostic Plots for the Proposed GEV Model

In Figure 4.2.1, both the probability plot and the quantile plot exhibit a very strong linear pattern between the quantiles of the sample and those of the theoretical GEVD, a confirmation that the proposed GEV model fits the data better than the Gumbel model. Furthermore, both the theoretical density function and the theoretical CDF of the GEV fit, superimposed on the observed sample maxima, affirm that the model is appropriate for the data at hand. These results indicate that the limiting distribution of the sample maxima belongs to the GEV family of distributions.
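The monthly blocking and the square-root transformation described above can be sketched in R as follows; the data frame, its column names and the simulated amounts are hypothetical placeholders for the hospital's claims records.

    # Forming monthly block maxima from a daily claims series, then transforming
    set.seed(7)
    dates <- seq(as.Date("2012-01-01"), as.Date("2019-12-31"), by = "day")
    claims_df <- data.frame(date = dates,
                            amount = rexp(length(dates), rate = 1 / 215))

    claims_df$month <- format(claims_df$date, "%Y-%m")
    monthly_max <- tapply(claims_df$amount, claims_df$month, max)  # block maxima
    maxima <- sqrt(monthly_max)              # square-root transformation used above
    length(maxima); head(maxima)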
4.2.1 Estimation of Parameters

Table 4.2.1 presents the ML and PWM estimates together with the 95% Profile Likelihood-based Confidence Interval (PL-CI), a more accurate measure than the normal-approximation confidence interval because it allows for skewed intervals for the estimates (Coles, 2001). The R software (see Appendix B8 and B9) was used to obtain the results below.

Table 4.2.1: Parameter Estimates for the GEV Model

Parameter      MLE     PWME    95% PL-CI
Location (µ)   50.69   50.28   [45.29, 56.23]
Scale (σ)      23.17   23.56   [19.74, 27.75]
Shape (ξ)      -0.18   -0.16   [-0.32, -0.0004]

Table 4.2.1 shows only a very slight difference between the parameter estimates under the MLE and the PWME. Furthermore, the table suggests that the distribution of the sample maxima is attracted to the Weibull class of distributions, since the shape parameter under both estimation methods is less than zero. However, a careful examination of Table 4.2.1 shows a near-zero value for the upper bound of the PL-CI of the shape parameter, which implies that the data could plausibly also be attracted to the Gumbel class of distributions. Using the same data at hand, we can perform various statistical tests to choose the appropriate GEV model: having two families of distributions as plausible fits to the data requires more objective tests in order to select the model that best fits the data. We therefore test the validity of the Gumbel model, just as we did for the GEV model, using the same data at hand. We first run the d