World Bank health data
library(data.table)
library(tidyverse)

The data were downloaded from Kaggle and saved as data.csv: https://www.kaggle.com/theworldbank/health-nutrition-and-population-statistics/data

data <- fread('~/Downloads/data.csv', fill=TRUE, stringsAsFactors=FALSE)
dim(data)
## [1] 89010    61
head(data)
##    Country Name Country Code
## 1:   Arab World          ARB
## 2:   Arab World          ARB
## 3:   Arab World          ARB
## 4:   Arab World          ARB
## 5:   Arab World          ARB
## 6:   Arab World          ARB
##                                                                                                           Indicator Name
## 1: % of females ages 15-49 having comprehensive correct knowledge about HIV (2 prevent ways and reject 3 misconceptions)
## 2:   % of males ages 15-49 having comprehensive correct knowledge about HIV (2 prevent ways and reject 3 misconceptions)
## 3:                                                         Adolescent fertility rate (births per 1,000 women ages 15-19)
## 4:                                                           Adults (ages 15+) and children (0-14 years) living with HIV
## 5:                                                    Adults (ages 15+) and children (ages 0-14) newly infected with HIV
## 6:                                                                                     Adults (ages 15+) living with HIV
##       Indicator Code            1960             1961             1962
## 1: SH.HIV.KNOW.FE.ZS                                                  
## 2: SH.HIV.KNOW.MA.ZS                                                  
## 3:       SP.ADO.TFRT 133.55501327769 134.159118941963 134.857912280869
## 4:       SH.HIV.TOTL                                                  
## 5:    SH.HIV.INCD.TL                                                  
## 6:       SH.DYN.AIDS                                                  
##                1963             1964            1965             1966
## 1:                                                                   
## 2:                                                                   
## 3: 134.504575565342 134.105211273476 133.56962589645 132.675635192775
## 4:                                                                   
## 5:                                                                   
## 6:                                                                   
##                1967             1968             1969             1970
## 1:                                                                    
## 2:                                                                    
## 3: 131.665502129354 129.190980115918 126.736756382819 124.382808900193
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                1971             1972             1973             1974
## 1:                                                                    
## 2:                                                                    
## 3: 122.133431342027 120.020185557559 118.087531093609 116.132988067096
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                1975             1976             1977             1978
## 1:                                                                    
## 2:                                                                    
## 3: 114.100918174437 111.980005447216 109.783821762662 106.033489239906
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                1979             1980             1981             1982
## 1:                                                                    
## 2:                                                                    
## 3: 102.341720681455 98.7390023274647 95.2412508672802 91.7911923993221
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                1983             1984             1985             1986
## 1:                                                                    
## 2:                                                                    
## 3: 88.0011769487606 84.2072557839419 80.3593225600132 76.4415956498419
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                1987             1988             1989             1990
## 1:                                                                    
## 2:                                                                    
## 3: 72.5145803648751 71.1706639452677 69.8887679924858 69.0044133814268
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                1991             1992             1993             1994
## 1:                                                                    
## 2:                                                                    
## 3: 67.7559924352118 66.9284506867798 64.9489678572737 62.9227777228154
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                1995             1996             1997             1998
## 1:                                                                    
## 2:                                                                    
## 3: 60.7070695260477 58.5966308804751 56.4401276304142 55.5315395528949
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                1999             2000             2001             2002
## 1:                                                                    
## 2:                                                                    
## 3: 54.6587808352011 53.8314102398679 52.9015276443892 51.9907926813042
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                2003             2004             2005             2006
## 1:                                                                    
## 2:                                                                    
## 3: 51.5228563035101 51.1032496482833 50.7325902239383 50.3291352282938
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                2007             2008             2009             2010
## 1:                                                                    
## 2:                                                                    
## 3: 49.9998514069402 49.8870459355469 49.7812066054555 49.6729747116906
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##                2011             2012             2013             2014
## 1:                                                                    
## 2:                                                                    
## 3: 49.5360469363113 49.3837446924523 48.7965576984378 48.1964180547578
## 4:                                                                    
## 5:                                                                    
## 6:                                                                    
##    2015 V61
## 1:       NA
## 2:       NA
## 3:       NA
## 4:       NA
## 5:       NA
## 6:       NA
colnames(data) <- gsub(" ", "_", colnames(data))
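The blank cells in the head() output suggest that the year columns were read as character rather than numeric. A minimal sketch of one way to convert them, assuming the year columns are exactly those whose names are four digits; `year_cols_to_numeric` is a hypothetical helper, not part of the original analysis:

```r
library(data.table)

# Hypothetical helper: convert every column whose name is a four-digit year
# to numeric. The data.table is modified by reference; empty strings become NA.
year_cols_to_numeric <- function(dt) {
    year_cols <- grep("^[0-9]{4}$", names(dt), value = TRUE)
    if (length(year_cols) > 0) {
        dt[, (year_cols) := lapply(.SD, as.numeric), .SDcols = year_cols]
    }
    invisible(dt)
}
```

It could be applied once after loading, for example `year_cols_to_numeric(data)`, so that later steps do not need their own `as.numeric()` calls.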

The dataset contains both individual countries and larger regions. First, I investigate the differences in life expectancy between regions.

Life expectancy in 2014 differs between regions for both females and males. In every region shown, life expectancy for males is lower than for females.

# The first 41 unique names in the file are regions and aggregates, not countries
regions <- data[data$Country_Name %in% unique(data$Country_Name)[1:41],]
# Keep the four identifier columns plus column 59, which holds the 2014 values
regions2014 <- regions[,c(1:4, 59)]
regions2014 %>%
    filter(Indicator_Name %in% c("Life expectancy at birth, female (years)",
                                 "Life expectancy at birth, male (years)")) %>%
    melt(id=c(1:4)) %>%
    # Derive the sex label from the indicator name and make the values numeric
    # so that the sort below is numeric rather than alphabetical
    mutate(Sex = trimws(gsub("Life expectancy at birth, |\\(years\\)", "", Indicator_Name)),
           value = as.numeric(value)) %>%
    arrange(value) %>%
    # Drop income groups and other aggregates that are not geographic regions
    filter(!grepl('income|&|dividend|small|Small|area|UN|conflict|poor', Country_Name)) %>%
    # Fix the axis order to the sorted order of the regions
    mutate(Country_Name = factor(Country_Name, levels=unique(Country_Name))) %>%
    ggplot(aes(x = Country_Name, y = value, fill=Sex, group=Country_Name)) +
        coord_flip() +
        ylim(0,90) +
        geom_line() +
        geom_point(pch=21, size=3) +
        scale_fill_brewer(palette="Set1") +
        #scale_fill_manual(values=c('deepskyblue','darkorange2')) +
        theme_classic() +
        theme(axis.title.x = element_text(size=14), axis.text.x = element_text(size=12,colour='black'),
            axis.text.y = element_text(size=12,colour='black'), axis.title.y = element_text(size=14)) +
        ggtitle('Life expectancy per region') +
        xlab('Region') +
        ylab('Life expectancy at birth (years)')

[Figure: life expectancy at birth per region in 2014, by sex]
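The statement that male life expectancy is lower in every region can also be checked numerically instead of read off the plot. A minimal sketch, assuming the table has the `Country_Name` and `Indicator_Name` columns and a year column named `2014` as above; `life_expectancy_gap` is a hypothetical helper, not part of the original analysis:

```r
library(data.table)

# Hypothetical helper: female-minus-male life expectancy per region
# for one year column of the wide table, sorted by the size of the gap.
life_expectancy_gap <- function(dt, year_col = "2014") {
    female <- "Life expectancy at birth, female (years)"
    male   <- "Life expectancy at birth, male (years)"
    keep <- dt$Indicator_Name %in% c(female, male)
    # Build a small long table holding only the two indicators of interest
    long <- data.table(Country_Name   = dt$Country_Name[keep],
                       Indicator_Name = dt$Indicator_Name[keep],
                       value          = as.numeric(dt[[year_col]][keep]))
    # One row per region, one column per sex
    wide <- dcast(long, Country_Name ~ Indicator_Name, value.var = "value")
    setnames(wide, c(female, male), c("female", "male"))
    wide[, gap := female - male]
    wide[order(gap)]
}
```

Called as `life_expectancy_gap(regions2014)`, a positive `gap` in every row would support the claim. The aggregate rows that the plot filters out would still be included unless they are removed first.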

Further analysis will investigate these differences.