Measuring Media Bias

mediabias.co.nz purports to provide an experimental measure of political reporting bias in NZ media sources, using a machine learning model that evaluates the sentiment of sentences from news articles mentioning MPs and political parties.

Although the technology side uses fairly standard and well-known processes, it’s not at all clear what any of this means. The relationship between sentiment and political reporting bias is neither explained nor justified.

Let’s unpack what we can, going through what they’ve shared about how their analytics pipeline works.

The procedure (a rough code sketch follows the list):

1. Compile a list of political entities (MPs and political parties) and group them into two sets: left wing and right wing.
2. Download, extract and filter the text of news articles from NZ media sources based on matching against the list of political entities.
3. For each selected news article, split the text into sentences.
4. Run each sentence through a named-entity recognition (NER) process and select all sentences that match the mention criteria for political entities.
5. Run a sentiment classifier on each filtered sentence to return a positive/negative valence score.
6. Use the classifications and sentiment scores from all matched sentences to calculate coverage and ‘political leaning’ scores.
7. Confidently display the results.
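
To make the scoring concrete, here’s a minimal sketch of what steps 3–6 might look like in Python, using spaCy for sentence splitting and entity recognition. The entity lists, the stand-in sentiment function and the sign convention are all my assumptions based on the site’s description, not its actual code:

```python
# A hypothetical sketch of steps 3-6. Entity lists, sentiment model and
# scoring convention are assumptions, not the site's actual code.
import spacy

nlp = spacy.load("en_core_web_sm")  # generic model; the site's is unknown

LEFT = {"Labour", "Green Party"}    # step 1: assumed entity groupings
RIGHT = {"National", "ACT"}

def sentiment(text: str) -> float:
    """Stand-in for a trained classifier returning valence in [-1, 1]."""
    return 0.0  # a real pipeline would call a sentiment model here

def leaning_score(article_text: str) -> float:
    """Tally signed sentiment over sentences mentioning tracked entities."""
    score = 0.0
    for sent in nlp(article_text).sents:                  # step 3
        mentions = {ent.text for ent in sent.ents
                    if ent.label_ in ("PERSON", "ORG")}   # step 4
        valence = sentiment(sent.text)                    # step 5
        if mentions & LEFT:                               # step 6
            score += valence
        if mentions & RIGHT:
            score -= valence
    return score
```

Even this toy version makes the problem visible: the output is entirely determined by the entity lists and the sign convention, long before any question of what ‘bias’ means gets asked.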

Nobody would bother to spend time criticising this if the project were blandly described as an index of political mentions and sentiment scores, but the stated goal of detecting media bias is much more ambitious.

The website does contain a short discussion of potential flaws in the procedure, written from the perspective that the methodology is basically sound and just needs a few small tweaks. Unfortunately, drawing any attention at all to methodology throws up a rat’s nest of persuasive problems for the site. It has to convince readers not only that the process works (in that it leads to usable results rather than an unstructured mess) but also that the methodology makes sense from a media studies and computational linguistics perspective and can meaningfully achieve what it sets out to.

Technically, most of the procedure is structured around getting to the point of extracting sentences so that stats can be derived from those sentences to keep score. “We’ve got our sights trained on you, media,” they seem to want to say. “Every time you run a sentence about The Green Party, the bias score goes up!”

What’s uniquely bizarre about this is the implied notion that media bias manifests as a movement towards or away from an idealised perfect equilibrium where every political entity is mentioned equally. Is the disparate daily chaos of MPs and parties being mentioned in news articles somehow supposed to cancel out to zero for any given publication? I don’t understand it.

The first thing to look at with any applied analytical or statistical method is whether reasonable and believable results obtain. For example, with a physics problem, we want the results of plugging inputs into an equation to fall within a broad range of realistic values, not be totally weird or physically impossible. The same is true in a lot of computing and stats situations. A quick eyeball check helps confirm that everything is set up end-to-end, that a procedure is doing roughly the right thing, and that it’s ready to support diving deeper. This is not to say checking results by intuition is infallible or rigorous enough in many situations (there are examples of it going catastrophically wrong), but before going any further on a data project, it’s important to spot outliers, oddities and unexpected patterns that indicate bugs in the underlying setup or flaws in the method.
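
In practice, this first pass can be as simple as printing per-outlet summary statistics and asking whether the ordering looks plausible. A hypothetical snippet, assuming a CSV of per-article leaning scores (the file and column names are mine):

```python
# Hypothetical first-pass eyeball check; file and column names assumed.
import pandas as pd

scores = pd.read_csv("leaning_scores.csv")  # columns: outlet, score

# Are scores in a plausible range? Does the per-outlet ordering match
# what anyone familiar with these publications would expect?
print(scores["score"].describe())
print(scores.groupby("outlet")["score"].mean().sort_values())
```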

It’s hard to ignore the irregular and unexpected patterns here. When I first looked at it, Newstalk ZB was rated further left than The Spinoff, and Kiwiblog was rated further right than The BFD: results that will come as a big surprise to anyone who knows these publications. Almost every single source analysed is rated as slanting to the left (the only exceptions are the explicitly partisan publications Kiwiblog and The BFD), which again will surprise anyone broadly familiar with NZ news coverage (and also seems to be a fairly obvious consequence of Labour being a majority government, and therefore being mentioned more).

As an eyeball check for reasonableness and believability, it’s a big problem if your analytics system spits out an assessment of media bias that roughly accords with the conspiratorial claims of deplatformed white supremacist Twitter accounts like DemocracyMum and RedBaiter, while confounding many observers with actual experience in discourse analysis and computational linguistics.

If haters want to object to that characterisation as a partisan statement in itself, consider how frequently media coverage over the past few decades has amplified the opinions of neoliberal economists, bankers and bank economists, corporate executives and lobbyists, concisely articulating their frames of reference and views on political economy. Obviously these claims are not exclusively presented and are often published alongside contrasting or opposing views, but any claim that right wing views are not continuously represented in NZ media makes very little sense in terms of what’s actually being published.

We could dive into all sorts of technical detail about the problems with drawing conclusions from sentiment analysis and the difficulty of extracting topic sentences, or unpack the value-laden assumptions of named-entity recognition, which is dislocated from the semantics of nouns in a text. Even detecting the boundaries of sentences is not a fully solved problem (it’s mostly there, but written language is extraordinarily complex and contradictory, and no existing system can get it right 100% of the time).
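
To illustrate the sentence-boundary point, here’s the kind of input that trips up off-the-shelf splitters. This uses spaCy’s default English model; exact behaviour varies by tool and version, so treat it as illustrative rather than a claim about any specific output:

```python
# Sentence boundary detection is heuristic: abbreviations, quotes and
# place names with full stops routinely confuse splitters. Output will
# vary by model and version.
import spacy

nlp = spacy.load("en_core_web_sm")
tricky = ('The Hon. Dr. Smith told reporters "We are done. Full stop." '
          'before leaving for Pt. Chevalier at 9 a.m. Monday was quieter.')
for i, sent in enumerate(nlp(tricky).sents, 1):
    print(i, sent.text)  # inspect where the boundaries land
```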

Beyond these threats to validity, the basic analytical problem that stands out is the question of whether it makes sense at all to label all NZ political parties and MPs as categorical members of two sets—left wing and right wing—and to use this to draw conclusions about media bias.

Folk wisdom suggests there’s a broad ideological spectrum in Parliament, with ACT further to the right, the Green Party further to the left, and National and Labour filling centre-right to centre-left positions. This is a reasonable heuristic for understanding where the parties sit in relation to one another, but it quickly falls apart when used to classify media coverage.

In day-to-day media and politics, these categories are disputed and contested. Most people agree that right wing and left wing views exist, but ask anyone to precisely classify their political opponents and you’ll get contradictory and conflicting answers. Labour are claimed to be dangerous far-left maniacs by some on the right, while many on the left say Labour are the stolid centrist inheritors of the Key/English legacy. Others assume the truth is somewhere in between. This gets even more complicated when considering individual MPs. NZ’s main parties are broad churches, with members holding a huge variety of beliefs and contrasting ideological and pragmatic positions on different issues.

There is no precise or reliable way to automatically decide whether any given media mention of a political party or MP has a direct relation to a left wing or right wing bias. The whole concept of ‘wings’ is a rough heuristic of political affect and ideological affinity, not an exact descriptive classification. Sentiment analysis at the individual sentence level further complicates this, as a sentence’s valence may or may not reflect the overall sentiment of the article and its political positioning.
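
As a concrete illustration, a negative-valence sentence that mentions a party tells you nothing about who the negativity is directed at. A sketch using a generic off-the-shelf sentiment classifier (the model and example sentences are mine, and actual scores will vary):

```python
# Negative valence in a sentence mentioning a party is not evidence of
# bias against that party. Generic model; outputs are illustrative only.
from transformers import pipeline

clf = pipeline("sentiment-analysis")  # default English sentiment model

examples = [
    # Plausibly scored negative, yet arguably favourable to the Greens:
    "The Green Party condemned the devastating pollution of our rivers.",
    # Plausibly scored negative, and plausibly unfavourable coverage:
    "The Green Party was accused of misleading voters.",
]
for text in examples:
    print(clf(text)[0], "->", text)
```

Both sentences would likely bump the same counter in the same direction, despite pointing at completely different things.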

You simply can’t argue that mentioning an MP by name constitutes a reference to a specific ideological construct and therefore is an instance of those constructs being favoured or disfavoured. Language does not work like that. News does not work like that.

Splitting everything into binary set membership means discarding the concept of the left-right spectrum altogether. You’re either in or you’re out. Good luck explaining Muldoon and the Fourth Labour government under such a model—let alone the actions of successive NZ governments over the past decade. Equating bias and set membership elides distinctions between forces and human tendencies that actually do seem to matter day-to-day, such as capitalism, racism, centrism, progressivism, authoritarianism, managerialism and many others. Of course, all this raises the spectre of the 2D political compass and other multidimensional meme-fuelled fever dreams beyond the simple 1D spectrum.

Why bother? There are only four parties associated with the right wing and left wing categories here. Why not just categorise the media coverage as directly associated with one party or another? It would mean there’s no need to explain, justify and defend the validity and coherence of a left/right classification. It’s a less spectacular claim, but one that makes the actual shape of the data much clearer to viewers.

What does it even mean to have a left wing or right wing bias? Making at least some attempt to reference existing published definitions of media bias feels like it should be a prerequisite for considering whether to build a computational system that measures it.

Doing discourse analysis ‘by hand’ has been a major part of media studies and sociolinguistics for a long time and for good reasons. Software developers would do well to spend more effort learning about existing scholarship on media bias and understanding the results of previous research on NZ newspapers rather than trying to number eight wire their own contraptions on the fly.

Without careful research and attention to detail, it’s extremely hard to generate valuable insights on the relationship of NZ media to politics using algorithms alone. Far more likely is that experiments like this become grist for the meme mill as National Party figures desperately clutch at digital flotsam and jetsam telling them what they want to hear, while their ship steadily sinks towards the waterline of 20% polling.


Mark Rickerby is a writer, designer and programmer in Ōtautahi–Christchurch.
