Friday, June 2, 2017

The EU referendum: how did Westminster constituencies vote?

All the major party manifestos are now out, and we can see what their plans are if they win the election: scrapping free school meals, maybe, or renewing / not renewing Trident. But whatever else the parties say, or plan to do, we also know that probably the single most important issue in this election is Brexit.

Here, a tension has been evident since last summer: although overall the country voted narrowly in favour of leaving the EU, the majority of MPs were personally against this (including, most notably, Theresa May). The Conservative Party has been working hard to deal with this, and is reportedly vetting new candidates standing in constituencies across the country to make sure they’re confirmed Brexiters (and not, say, soft Leavers – or worse, outright Remainers).

But again, this poses a problem, as we don’t actually know how the national vote to Leave the EU broke down at constituency level – and so we don’t know where there might be mismatches between a Leave-voting constituency and a Remain-voting MP (or vice versa).

Six days after last year’s referendum, I sat down to try and work out how Westminster constituencies had voted. This turned out to be quite tricky, because the referendum result was counted by local authority area – not by constituency. The two only roughly overlap, and so I had to make some adjustments to get the calculations to work out.

I’ve recently had the final set of estimates (and the methodology for producing them) accepted for publication in an academic journal, and so I thought I would describe the steps I’ve taken to work out reasonable answers to what seems like a simple question.

Since the referendum I’ve published three sets of estimates:

    an initial set of estimates produced the week after the referendum;
    a second set of estimates produced in August, and upon which basis I initially submitted my article;
    a third set of estimates, which resulted from changes suggested during the review process.

For the first set of estimates I built a statistical model which explained the Leave share of the vote in each local authority area using demographic characteristics. I then used that model to extrapolate from the demographic characteristics of Westminster constituencies.
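As a rough illustration of this first approach, here is a minimal sketch in Python. It is not the model from the article: it assumes a single demographic predictor (the share of graduates) where the real model used several characteristics, the figures are invented, and the names `fit_line` and `predict` are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares with one predictor: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x_new):
    """Extrapolate the fitted line to a new area."""
    return intercept + slope * x_new


# Invented local authority data (NOT real referendum figures).
la_graduate_share = [0.10, 0.20, 0.30, 0.40]
la_leave_share = [0.65, 0.55, 0.45, 0.35]

# Step 1: fit the model on local authorities, where the result is known.
intercept, slope = fit_line(la_graduate_share, la_leave_share)

# Step 2: apply it to a constituency, where only the demographics are known.
constituency_graduate_share = 0.25  # invented
estimate = predict(intercept, slope, constituency_graduate_share)
print(estimate)
```

Nothing in this procedure ties the constituency estimate back to the counted local authority result, which is the weakness described next.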

The problem with this first set of estimates was that some were demonstrably wrong. Even where a constituency overlapped perfectly with a local authority, this method wasn’t guaranteed to reproduce the (known) local authority result.

I fixed this problem with the second set of estimates. Here I built a statistical model which “explained” the number of Leave and Remain voters in each local authority area. I used that model to extrapolate from the demographic characteristics of groups of Census Output Areas. I then divided or multiplied these extrapolations as appropriate to make sure that they added up to the local authority totals, before adding these scaled extrapolations up to Westminster constituencies.
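The scaling-and-adding step can be sketched roughly as follows. This is a simplified illustration rather than the code behind the estimates: the function name `scale_and_aggregate`, the dictionary keys and all the numbers are assumptions made up for the example, and only Leave counts are shown.

```python
from collections import defaultdict


def scale_and_aggregate(areas, la_totals):
    """Rescale small-area predictions to match known local authority
    totals, then sum the rescaled values by constituency.

    areas: list of dicts with keys 'la', 'constituency', 'predicted_leave'
    la_totals: dict mapping local authority -> counted Leave votes
    """
    # Sum the raw model predictions within each local authority.
    raw_sums = defaultdict(float)
    for area in areas:
        raw_sums[area["la"]] += area["predicted_leave"]

    # Multiply (or, equivalently, divide) each prediction so that the
    # local authority total is matched, then add up to constituencies.
    by_constituency = defaultdict(float)
    for area in areas:
        factor = la_totals[area["la"]] / raw_sums[area["la"]]
        by_constituency[area["constituency"]] += area["predicted_leave"] * factor
    return dict(by_constituency)


# Invented groups of output areas (NOT real data). Constituencies X and Y
# straddle local authorities A and B; constituency Z coincides with C.
areas = [
    {"la": "A", "constituency": "X", "predicted_leave": 100.0},
    {"la": "A", "constituency": "Y", "predicted_leave": 300.0},
    {"la": "B", "constituency": "X", "predicted_leave": 200.0},
    {"la": "B", "constituency": "Y", "predicted_leave": 200.0},
    {"la": "C", "constituency": "Z", "predicted_leave": 50.0},
    {"la": "C", "constituency": "Z", "predicted_leave": 70.0},
]
la_totals = {"A": 500.0, "B": 300.0, "C": 150.0}  # invented counted results

result = scale_and_aggregate(areas, la_totals)
print(result)
```

In this toy example, constituency Z overlaps perfectly with local authority C, so its estimate equals the counted total for C by construction. That is the property the first set of estimates lacked.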

The problem with this second set of estimates was that I had not accounted for some relationships between demographic variables. For example, we know – roughly – that the referendum vote was highly correlated with education levels, as areas with greater proportions of people with university degrees tended to vote Remain.

But high levels of graduate qualifications mean something different in older constituencies compared to younger constituencies: older people had fewer chances to go to university, because participation in tertiary education was much lower when they were growing up. If, despite this, an older constituency has lots of graduates, that may tell us something different from the same proportion of graduates in a younger constituency.

I fixed this problem with the third set of estimates, which included an interaction term between age and the proportion of the population with higher educational qualifications.
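To show what an interaction term does mechanically, here is a small sketch. The coefficient values are arbitrary placeholders chosen only to make the arithmetic visible; they are not the fitted values from the article and imply nothing about the direction of the real effect. The use of median age, the outcome scale and the function names are likewise assumptions.

```python
def linear_predictor(grad_share, median_age, coef):
    """Linear model with an age x graduates interaction term."""
    return (
        coef["intercept"]
        + coef["grad"] * grad_share
        + coef["age"] * median_age
        + coef["grad_x_age"] * grad_share * median_age  # the interaction
    )


def graduate_effect(median_age, coef):
    """Slope on graduate share at a given age.

    Without the interaction this would be the constant coef['grad'];
    with it, the slope shifts as the area gets older or younger.
    """
    return coef["grad"] + coef["grad_x_age"] * median_age


# Placeholder coefficients (hypothetical, NOT estimated from any data).
coef = {"intercept": 0.5, "grad": -1.0, "age": 0.005, "grad_x_age": 0.01}

for age in (35, 50):
    print(age, graduate_effect(age, coef))
```

The point is only that the same proportion of graduates is allowed to carry a different weight in an older area than in a younger one.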

Fortunately, this last change (which emerged during the review process) did not affect the overall story. These estimates are all very similar. The graph below shows, in the lower left, the pairwise scatter-plots of the different sets of estimates, and in the upper right, the correlation between the sets of estimates. The correlation between the second and third sets of estimates is very high indeed.
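For readers who want to run this kind of comparison themselves, the pairwise correlations can be computed along these lines. The three short lists are invented stand-ins for the real sets of constituency estimates, and `pearson` is a hand-written helper rather than anything taken from the article.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Invented Leave-share estimates for five constituencies (NOT real data).
estimates = {
    "first": [0.52, 0.61, 0.38, 0.70, 0.45],
    "second": [0.50, 0.63, 0.36, 0.69, 0.47],
    "third": [0.51, 0.63, 0.35, 0.69, 0.47],
}

# One correlation per pair of estimate sets.
pairwise = {
    (a, b): pearson(estimates[a], estimates[b])
    for a, b in combinations(estimates, 2)
}
for pair, r in pairwise.items():
    print(pair, round(r, 3))
```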
