While the world is watching the Crimea crisis, the United States continues its “war on terror”. In March 2014 alone, the US carried out at least four drone strikes in Yemen, killing at least 7, possibly 12 people. We’ve covered drone strikes before on this blog. One of the main motivations of my earlier post on this issue was Obama’s pledge in May 2013 to reduce drone attacks as a means to combat terrorism worldwide. I drew on data from Josh Begley to quantify and visualize the frequency of US drone strikes since 2002. Here’s the plot from my previous post that clearly shows how the United States has increased the use of drone strikes since Obama entered office in 2009.

It’s been almost a year since Obama made the pledge to cut back on the use of drones. Has he really lived up to his promise? Luckily we have the data to answer this question unequivocally with yes and no. Yes, the US has substantially decreased the number of drone attacks since the beginning of 2013. But it has by no means stopped using them: the average remains considerably high at about four drone attacks per month.

This assessment is based on a *change point analysis* of the frequency of US drone strikes per month. One of the good things about writing blog posts like these is that you always learn new things: until a couple of hours ago I had no idea that there was such a thing as change point analysis. But while I was revisiting my old post on the subject, I recalled a post on another blog about changes in ‘The Simpsons’ episode ratings. So, after digging up said blog post on DiffusePrior, I realized that I could apply the same technique that DiffusePrior used to assess change points in the episode ratings of a cartoon show to the much bleaker topic of US drone strikes.

Change point analysis aims to detect abrupt changes in time-series data. Typically, it is used to detect significant shifts in particular properties of different sequences of the data, such as the mean or the variance. It is employed in a wide range of fields, from climate science (“Were there really shifts in the mean temperature over time?”) to economics (“Was there a change in average GDP or market indices over time?”), and in many other areas. There is a vast, very technical literature on the topic, since change point detection is a non-trivial subfield of signal detection in the context of noisy data and can be computationally very complex. I do not claim to have studied this topic extensively, so if any of the following interpretations or methods are blatantly wrong, please feel free to add a correction note in the comments section.
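To make the idea of a shift in the mean concrete, here is a minimal sketch in base R. It is not what the changepoint package does internally and it does not use the drone strike data: the series is simulated, and the helper `split_cost` is a made-up name for illustration. The sketch tries every possible split of the series into two segments and keeps the split that leaves the least variation around the two segment means.

```r
# Minimal sketch: locate a single shift in the mean of a series (base R only).
# The data are simulated; nothing here uses the drone strike counts.
set.seed(42)
x <- c(rnorm(40, mean = 1), rnorm(30, mean = 5))  # true shift after observation 40

# Cost of splitting after position tau (hypothetical helper):
# the sum of squared deviations from each segment's own mean.
split_cost <- function(x, tau) {
  left  <- x[1:tau]
  right <- x[(tau + 1):length(x)]
  sum((left - mean(left))^2) + sum((right - mean(right))^2)
}

candidates <- 2:(length(x) - 2)                 # keep at least two points per segment
costs      <- sapply(candidates, split_cost, x = x)
tau_hat    <- candidates[which.min(costs)]      # estimated last index of the first segment

tau_hat                                         # estimated change point
mean(x[1:tau_hat])                              # mean before the change point
mean(x[(tau_hat + 1):length(x)])                # mean after the change point
```

With a shift this large relative to the noise, the estimated split should land at or very near the true one. Real data are rarely that clean, which is why the methods in the literature add a penalty or a test before declaring a change point.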

We can use change point detection to see if there have been any significant changes in average US drone strike activity since 2002. I’m concentrating on changes in mean drone strike activity. (Another possibility, not explored here, would be detecting changes in variance.) In other words, I look for points at which the mean number of drone strikes shifts significantly from one sequence of months to the next. “Significant” in this context means *statistically* significant: the software tests whether the mean before a candidate change point differs significantly from the mean after it. Whether this difference is also *substantively* significant must be established by the researcher.
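One rough way to get a feeling for what “statistically significant” means here is a permutation check, sketched below in base R. Note that this is a swapped-in technique for illustration, not the criterion the changepoint package applies (it relies on penalized test statistics), and that the counts are simulated rather than the real drone data. The helper `max_gain` is a hypothetical name. The idea: if shuffling the months rarely produces a split as good as the one observed, the shift is unlikely to be pure noise.

```r
# Permutation check for a single shift in the mean (illustrative, base R only).
set.seed(1)
x <- c(rpois(40, lambda = 2), rpois(20, lambda = 8))  # simulated monthly counts

# Largest reduction in squared error achievable by any single split (hypothetical helper).
max_gain <- function(x) {
  n     <- length(x)
  total <- sum((x - mean(x))^2)
  gains <- sapply(2:(n - 2), function(tau) {
    left  <- x[1:tau]
    right <- x[(tau + 1):n]
    total - sum((left - mean(left))^2) - sum((right - mean(right))^2)
  })
  max(gains)
}

observed <- max_gain(x)

# Null distribution: shuffling the order destroys any real shift over time.
null_gains <- replicate(999, max_gain(sample(x)))

# Share of shuffled series that do at least as well as the observed one.
p_value <- (1 + sum(null_gains >= observed)) / (1 + length(null_gains))
p_value
```

The shuffle assumes that months are exchangeable when there is no change point, which ignores autocorrelation, so a small p-value should be read as a hint rather than proof.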

So let’s dive into it. The R package changepoint gives us the tools to analyze the data (see also the useful vignette for an overview). I use an updated version of the script in my previous post to download and clean the drone strike data. The cpt.mean command takes a time series object and calculates change points in the data. We need to give the command a method of estimating these break points, however, and there are quite a few of those methods available. So, which one do we take? I opted for the “Binary Segmentation” method, because it has been around for a long time. Another method might be more appropriate, so let me know if you would choose a different method! However, the “Binary Segmentation” method requires that you give it a maximum number (“Q”) of change points to search for. But how many do we choose? Since we do not know *a priori* the possible number of change points in the data, I chose to loop over a wide range of values for Q (1 to 20) to ensure that any result I obtain is not an artifact of the maximum number of change points assumed. If a change point is stable over a range of Qs, we can be more confident that it is an actual change point in the data. Here’s the code that I’m using:

```r
#########################
# Change point analysis #
#########################

library(changepoint)
library(fields)  # for xline
library(Cairo)
library(car)

# calculate changepoints
for (i in 20:1) {
  CairoPNG(file = paste0("./figs/changepoint", i, ".png"),
           pointsize = 11, width = 1280, height = 522)
  mean.drone <- cpt.mean(drone_sum_month$CountMonth, method = 'BinSeg', Q = i)
  par(mar = c(6, 4, 4, 2))
  plot(mean.drone, type = 'l', cpt.col = 'red', xlab = '',
       ylab = 'Frequency of US drone strikes', cpt.width = 2,
       main = "Changepoints in US drone strike activity", xaxt = "n")
  axis(1, at = cpts(mean.drone),
       labels = format.Date(drone_sum_month$Dates[cpts(mean.drone)], "%b-%y "),
       las = 3)
  xline(which.names("2009-01-01", drone_sum_month$Dates), lty = 2, col = "blue3", lwd = 2)        # 1st Obama presidency
  xline(which.names("2013-01-01", drone_sum_month$Dates), lty = 3, col = "blue3", lwd = 2)        # 2nd Obama presidency
  xline(which.names("2013-06-01", drone_sum_month$Dates), lty = 4, col = "chartreuse4", lwd = 2)  # promise to reduce drone strikes
  dev.off()
}
```
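The loop produces one plot per Q, and stability has to be judged by eye. As a complement, the following sketch tallies how often each change point location shows up across the range of Qs. It is only an illustration under stated assumptions: the `counts` vector is simulated as a stand-in for `drone_sum_month$CountMonth`, and the warnings that the package may emit when the number of detected change points reaches Q are suppressed for readability.

```r
library(changepoint)

set.seed(7)
# Simulated stand-in for the monthly counts (assumption, not the real data);
# swap in drone_sum_month$CountMonth to apply this to the drone strike series.
counts <- c(rpois(60, lambda = 1), rpois(24, lambda = 6), rpois(12, lambda = 2))

qs <- 1:20

# For each maximum number of change points Q, record the detected locations.
found <- lapply(qs, function(q) {
  fit <- suppressWarnings(cpt.mean(counts, method = "BinSeg", Q = q))
  cpts(fit)
})

# How many of the 20 runs report each location? High counts suggest stable change points.
stability <- sort(table(unlist(found)), decreasing = TRUE)
head(stability)
```

Locations that appear in most of the 20 runs are the ones worth interpreting. Locations that only appear for large Q are more likely to be artifacts of allowing many segments.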

I’ve animated the resulting .PNGs as a .gif to visualize the change point analysis. For a high-res version of the .gif click on the picture or see here (although I’m not sure how long the latter resource will be online).

The two blue lines represent the inauguration of the Obama administrations I and II, the green line indicates the month when Obama made the pledge to reduce drone attacks as a weapon in the war on terror.

We see that the software initially detects quite a high number of change points, but several dates remain stable if we reduce the maximum number of change points the software is looking for. One is clearly July 2008, when we observe a marked increase in the frequency of drone strikes, whereas drones were only a sparsely used instrument in the previous years. Another stable change point is August 2010, when we see a considerable spike in the frequency of drone attacks. In fact, this spike is so large that most iterations of the software mark the period between August 2010 and December 2010 as a separate segment. December 2010 also seems to be a major change point which is stable over most change point calculations regardless of Q.

Most interestingly, January 2013, the beginning of the second Obama administration, is one of the most stable change points over most calculations. This indicates that 2013 has indeed seen a significant reduction of drone strikes compared to the preceding months and years. It is notable, however, that this change point apparently comes *before* Obama made his promise. In fact, we see a sharp increase of drone strikes right *after* his speech in late spring of 2013. One possible interpretation could be that Obama’s pledge merely reflected a change in strategy and practice that his military had already made, rather than the military following Obama’s directive. I don’t have the necessary insights into the politics between White House and Pentagon, so please let me clearly label this interpretation as pure speculation (if anybody with more knowledge on this wants to weigh in, please feel free!). Here is the change point analysis at Q=5 which looks fairly reasonable:

Also, we should be careful not to overinterpret the results of the change point analysis. We do not have many data points, which makes the estimates a bit shaky; usually more data is better. But since we are analyzing drone attacks that have already killed thousands of people, we should be happy that we don’t have more data. Also, as mentioned in the earlier post on this topic, the numbers on drone attacks are the best we have, but that doesn’t mean they are always *correct*. The administration does not provide official figures, so all data we have is based on news reports and NGO observations. We can safely assume that the actual numbers are higher.

The data does, however, allow me to at least give a tentative answer to the question from the beginning: It seems that Obama did, in fact, reduce the frequency of drone strikes, as promised. Nevertheless, at an average of four drone strikes per month, killing more than 180 people since 2013, US drone strikes remain a very much active weapon in the war on terror. Given the high civilian death toll and the murky legality of this activity, we therefore still have reason enough to be concerned.

**Edit:** Jay Ulfelder has reminded me on Twitter:

Being new to this entire change point business, this is something I didn’t know and therefore clearly forgot to mention in the post. But it makes intuitive sense. Obviously, the observed pattern could change once we have more data, especially if we add more data from 2014. (But as I pointed out, let’s hope we won’t see more data points in the future, only fewer. Or zeros.) I’ll make sure to re-run the analysis in a couple of months to see whether this holds up.

As usual, I like this a lot. Great post! However, I am a little puzzled about the added value of change point analysis. Could you explain a little more why we need this complex method to answer questions like these? By simply looking at figure 1, I can tell that there has been a reduction in drone strikes since 2013.

Best

Felix

Great question. From what I’ve read on the topic, it seems as if the added value of change point analysis here is to assess whether the detected change point of January 2013 is actually a statistically significant change point, relative to the variation in the rest of the data. Because it could be simply noise or regression to the mean given the high variation in the frequency of drone strikes between months. But you’re absolutely right, usually eyeballing the data or looking at summary stats is the way to go to detect change points before applying any complex method of estimating them.