Continuing from the last two weeks, I’m looking at the change in mangrove forest area between 2000 and 2012 as an indication of the health of these forests and of the level of environmental stewardship in the corresponding country.
Mangrove data from here; environmental health/vitality data from the Environmental Performance Index project here (metadata PDF here). See my reports from the last two weeks for more background.
The specific question is: Which environmental factors correlate with changes in mangrove forest health? The answer, as we’ll see, may be “Access to Sanitation” and “Fish Stocks”.
We have good mangrove data on 102 countries, but the following 16 of them are not listed in the EPI table, so we need to omit them.
## [1] "New Caledonia" "Palau"
## [3] "Cayman Islands" "Puerto Rico"
## [5] "Micronesia, FS" "Virgin Islands, U.S."
## [7] "Saint Lucia" "Curacao"
## [9] "Virgin Islands, British" "St.Vincent & Grenadines"
## [11] "Turks and Caicos " "Aruba"
## [13] "Anguilla" "Saint Martin"
## [15] "Bermuda"
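As a rough sketch of how the countries missing from the EPI table might be found and dropped: the data frames `mangrove` and `epi` below are tiny hypothetical stand-ins (the real tables are much larger), and only the `Country` column is assumed.

```r
# Hypothetical stand-ins for the two tables, keyed by a Country column
mangrove <- data.frame(Country = c("Myanmar", "Aruba", "Malaysia", "Bermuda"))
epi      <- data.frame(Country = c("Myanmar", "Malaysia", "Taiwan"))

# Countries with mangrove data but no row in the EPI table
missing_from_epi <- setdiff(mangrove$Country, epi$Country)
print(missing_from_epi)

# Keep only the countries that appear in both tables
mangrove_kept <- mangrove[mangrove$Country %in% epi$Country, , drop = FALSE]
print(mangrove_kept)
```

Matching on country names is fragile (spelling and punctuation differ between sources), so matching on a numeric ISO code, where both tables have one, would probably be safer.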
For the remaining 86 countries, the EPI table gives 37 different indicators of “environmental health” and “ecosystem vitality”.
Several of them are simply weighted combinations of others, so we can throw these out.
Next we need to deal with missing values. The 10 NAs for Nitrogen.Balance and Nitrogen.Use.Efficiency are, according to the metadata, for ten countries that have minimal agricultural output. We’ll fill these in with the mean values.
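Mean imputation of this kind can be sketched as follows. The column names match the ones above, but the values are made up for illustration and the data frame `df` is a hypothetical stand-in for the real table.

```r
# Toy data frame: column names follow the post, values are invented
df <- data.frame(
  Nitrogen.Balance        = c(40, NA, 60, 80, NA),
  Nitrogen.Use.Efficiency = c(10, 20, NA, 30, NA)
)

# Replace each NA with the mean of the observed values in the same column
for (col in c("Nitrogen.Balance", "Nitrogen.Use.Efficiency")) {
  col_mean <- mean(df[[col]], na.rm = TRUE)
  df[[col]][is.na(df[[col]])] <- col_mean
}
print(df)
```

Filling with the column mean leaves each column's mean unchanged but shrinks its variance, which is worth keeping in mind when reading p-values for these two variables.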
With 25 of the 86 rows missing a value for Trend.in.Carbon.Intensity, I don’t see a sound way to impute this variable, so I’m going to remove the column. In the end, here are our column names.
## [1] "ISO_3_Num"
## [2] "Percent.Change.in.Mangrove.Forest"
## [3] "Country"
## [4] "X2016.EPI.Score"
## [5] "X10.Year.Percent.Change"
## [6] "Environmental.Risk.Exposure"
## [7] "Household.Air.Quality"
## [8] "Household.Air.Quality...Risk.Exposure"
## [9] "Air.Pollution...Average.Exposure.to.PM2.5"
## [10] "Air.Pollution...Average.Exposure.to.PM2.5...Risk.Exposure"
## [11] "Air.Pollution...Average.PM2.5.Exceedance"
## [12] "Air.Pollution...Average.Exposure.to.NO2"
## [13] "Access.to.Sanitation"
## [14] "Unsafe.Sanitation..Risk.Exposure"
## [15] "Access.to.Drinking.Water"
## [16] "Unsafe.Drinking.Water.Quality..Risk.Exposure"
## [17] "Wastewater.Treatment"
## [18] "Nitrogen.Use.Efficiency"
## [19] "Nitrogen.Balance"
## [20] "Tree.Cover.Loss"
## [21] "Fish.Stocks"
## [22] "Terrestrial.Protected.Areas..National.Biome.Weights."
## [23] "Terrestrial.Protected.Areas..Global.Biome.Weights."
## [24] "Marine.Protected.Areas"
## [25] "Species.Protection..National."
## [26] "Species.Protection..Global."
## [27] "Trend.in.CO2.Emissions.per.KwH"
## [28] "Access.to.Electricity"
Let’s use the environmental indicators (columns 4-28) to fit a linear regression for the values of Percent.Change.in.Mangrove.Forest.
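A minimal sketch of such a fit, assuming the same column layout (identifier, response, country name, then predictors from column 4 onward). The data here are simulated and only three predictors are included, so the numbers mean nothing; only the mechanics carry over.

```r
set.seed(1)
n <- 30

# Simulated stand-in: columns 1-3 are ID, response and country; 4+ are predictors
df <- data.frame(
  ISO_3_Num = seq_len(n),
  Percent.Change.in.Mangrove.Forest = rnorm(n, mean = -1),
  Country = paste0("Country", seq_len(n)),
  Access.to.Sanitation = runif(n, 0, 100),
  Fish.Stocks = runif(n, 0, 100),
  Tree.Cover.Loss = runif(n, 0, 100)
)

# Build the formula from the predictor columns rather than typing them out
predictors <- names(df)[4:ncol(df)]
form <- reformulate(predictors, response = "Percent.Change.in.Mangrove.Forest")

fit <- lm(form, data = df)

# Coefficient estimates with their p-values
print(summary(fit)$coefficients[, c("Estimate", "Pr(>|t|)")])
```

Building the formula with `reformulate` keeps the ID and country columns out of the model without having to drop them from the data frame.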
The only variables with p-values below 0.1 are Air.Pollution...Average.PM2.5.Exceedance (p-value 0.0212) and Air.Pollution...Average.Exposure.to.NO2 (p-value 0.0676), and only the first is below the usual 0.05 cutoff. The first is a measure of how often the PM2.5 particulate matter was above certain human health targets. The second measures levels of nitrogen dioxide, another type of air pollution. While I find this analysis disappointing, perhaps it indicates that mangroves are more sensitive to air quality than to other environmental health indicators.
Or, we can take the position that our model has some issues and we need to remove some outliers. Here are the countries with the highest influence as ranked by dffits, which measures how much a country’s own fitted value changes, in standard-error units, when that country is left out of the fit. I also display their actual Percent.Change.in.Mangrove.Forest.
## 7 37 73 3 55 45
## Country Myanmar Guatemala Taiwan Malaysia Sri Lanka Brunei Darussalam
## dffits -3.252 -2.845 -2.066 -1.509 1.429 1.275
## MangChange -8.42 -6.411 -3.571 -4.887 -1.203 -0.787
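A ranking like the one above could be produced along these lines. This sketch uses R’s built-in `mtcars` data as a stand-in for the mangrove table, so the model and the row names are illustrative only.

```r
# Stand-in model on a built-in dataset (not the mangrove data)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# DFFITS for every observation, named by row
d <- dffits(fit)

# Six most influential rows, ordered by absolute value but keeping the sign
top <- d[order(abs(d), decreasing = TRUE)][1:6]
print(round(top, 3))

# Show the actual response next to the influence measure
print(data.frame(dffits = round(top, 3), mpg = mtcars[names(top), "mpg"]))
```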
The function cooks.distance calculates the overall change in the regression coefficients when a country is omitted from the calculation. Here are the countries with the largest cooks.distance.
## 7 37 73 3 55
## Country Myanmar Guatemala Taiwan Malaysia Sri Lanka
## CooksDist 0.320397056 0.246735278 0.1536278 0.079932539 0.075605673
## 45
## Country Brunei Darussalam
## CooksDist 0.062255164
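To make the Cook’s distance numbers less of a black box, here is a small sketch that computes them with `cooks.distance` and then checks one value by hand, by refitting without that row. It again uses the built-in `mtcars` data as a stand-in, not the mangrove table.

```r
# Stand-in model on a built-in dataset (not the mangrove data)
fit <- lm(mpg ~ wt + hp, data = mtcars)

cd  <- cooks.distance(fit)
top <- head(sort(cd, decreasing = TRUE), 6)
print(round(top, 4))

# Check the definition for the most influential row:
# refit without it and see how far all fitted values move.
i <- which.max(cd)
fit_minus_i <- lm(mpg ~ wt + hp, data = mtcars[-i, ])
shift <- predict(fit, mtcars) - predict(fit_minus_i, mtcars)

p  <- length(coef(fit))        # number of coefficients, intercept included
s2 <- summary(fit)$sigma^2     # residual variance of the full model
manual <- sum(shift^2) / (p * s2)

print(all.equal(unname(cd[i]), manual))
```

The hand computation sums the squared shifts in all fitted values, which is equivalent to measuring the overall change in the coefficient vector.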
Since the mean for mangrove change is -1.035%, and these four countries lost between 3.6% and 8.4%, we start to get the picture that Myanmar, Guatemala, Taiwan, and Malaysia are outliers. This is also confirmed by looking at the diagnostic plots from plot(fit).
Let’s take out Myanmar, Guatemala, Taiwan, and Malaysia, and redo the regression. Here are the coefficients and their p-values.
## Estimate
## (Intercept) 0.560828938
## X2016.EPI.Score 0.046361189
## X10.Year.Percent.Change 0.019440734
## Environmental.Risk.Exposure -0.024337020
## Household.Air.Quality -0.004258687
## Household.Air.Quality...Risk.Exposure 0.013797241
## Air.Pollution...Average.Exposure.to.PM2.5 -0.014487383
## Air.Pollution...Average.Exposure.to.PM2.5...Risk.Exposure 0.008212617
## Air.Pollution...Average.PM2.5.Exceedance 0.019504081
## Air.Pollution...Average.Exposure.to.NO2 -0.022110066
## Access.to.Sanitation 0.030076703
## Unsafe.Sanitation..Risk.Exposure -0.018949572
## Access.to.Drinking.Water 0.002012261
## Unsafe.Drinking.Water.Quality..Risk.Exposure -0.017669346
## Wastewater.Treatment -0.009184633
## Nitrogen.Use.Efficiency -0.010581689
## Nitrogen.Balance -0.006131900
## Tree.Cover.Loss -0.001056948
## Fish.Stocks 0.018880971
## Terrestrial.Protected.Areas..National.Biome.Weights. 0.036540091
## Terrestrial.Protected.Areas..Global.Biome.Weights. -0.027023588
## Marine.Protected.Areas -0.001560215
## Species.Protection..National. 0.012044092
## Species.Protection..Global. -0.032422989
## Trend.in.CO2.Emissions.per.KwH 0.002163566
## Access.to.Electricity -0.014282706
## Pr(>|t|)
## (Intercept) 0.78500295
## X2016.EPI.Score 0.42321049
## X10.Year.Percent.Change 0.22688653
## Environmental.Risk.Exposure 0.20803093
## Household.Air.Quality 0.77939509
## Household.Air.Quality...Risk.Exposure 0.51401324
## Air.Pollution...Average.Exposure.to.PM2.5 0.40727772
## Air.Pollution...Average.Exposure.to.PM2.5...Risk.Exposure 0.26321236
## Air.Pollution...Average.PM2.5.Exceedance 0.28979931
## Air.Pollution...Average.Exposure.to.NO2 0.05589238
## Access.to.Sanitation 0.02539785
## Unsafe.Sanitation..Risk.Exposure 0.18859060
## Access.to.Drinking.Water 0.82730202
## Unsafe.Drinking.Water.Quality..Risk.Exposure 0.11276097
## Wastewater.Treatment 0.35949556
## Nitrogen.Use.Efficiency 0.27587576
## Nitrogen.Balance 0.19632978
## Tree.Cover.Loss 0.84125274
## Fish.Stocks 0.02661137
## Terrestrial.Protected.Areas..National.Biome.Weights. 0.12517102
## Terrestrial.Protected.Areas..Global.Biome.Weights. 0.21328762
## Marine.Protected.Areas 0.77755037
## Species.Protection..National. 0.60170904
## Species.Protection..Global. 0.13768863
## Trend.in.CO2.Emissions.per.KwH 0.76443145
## Access.to.Electricity 0.39769715
The variable Air.Pollution...Average.Exposure.to.NO2 is still marginally significant, with p-value 0.0559.
But now we have two variables that are significant at the 0.05 level. The p-value for Access.to.Sanitation is 0.0254, and the p-value for Fish.Stocks is 0.0266.
The Access.to.Sanitation variable measures the percentage of the population that has access to “improved” sanitation, which “hygienically separates human excreta from human contact and is not public or shared, only private”. Furthermore, the coefficient is relatively large at 0.03. So a one-point increase in your population’s access to sanitation is associated with a mangrove forest change that is 0.03 percentage points higher over the twelve years, all other things being held constant.
The Fish.Stocks variable measures “the fraction of fish stocks overexploited and collapsed by exclusive economic zone”. The goal is to have this down to zero. But the coefficient in my regression is 0.0189, which means a one-point increase in this variable corresponds to a mangrove forest change that is 0.0189 percentage points higher. I would naively expect this coefficient to be negative, so maybe there are confounding variables related to how this variable is measured for each country…?