While most of the post-event commentaries focused on the more counterintuitive findings of the first study – on the “Rationality of Political Online Space”, presented by Dr. Carol Soon, Research Fellow at the Institute of Policy Studies (IPS) – less attention was given to the second presentation, by Professor Lim Ee-Peng of the School of Information Systems at the Singapore Management University, on the “Possibilities of Machine Classification”, and how these technologies could help gauge political sentiments from online content.
Some of the findings from the first study include:
– The more political a blog is, the more objective it is.
– There is no relationship between the blogger’s identity and objectivity.
– There is no correlation between objectivity and partisanship for the government and the opposition.
– There is no relationship between emotionality and partisanship for the government or the opposition.
– Highly political blogs are more anti-government but they are also more anti-opposition.
These relationships are correlations, not causal claims, so the direction of influence is unknown. For instance, the objectivity of a blog could be what determines whether it is political, rather than the reverse.
At the “Assessing the Rationality of Political Online Space: Man and Machine” event on February 11, 2015 organised by the IPS, participants raised questions on these findings after Dr. Soon and Professor Lim concluded their presentations. In particular, they contested the definitions of “political”, how to ascertain whether a blog was for or against the government or the opposition, as well as the corresponding limitations of content analysis.
Be that as it may, insufficient time – I figured – was spent on the possibilities for collaboration between “man and machine” to gauge political sentiments or opinions. The way forward, in other words. Traditionally, as was done for the first study by Dr. Soon and Mr. Tan Tarn How, Special Research Fellow at IPS, key terms and characteristics must be defined academically, before coders are trained to look through hundreds or even thousands of posts. While these coders “can give accurate reading of natural language text”, in the words of Professor Lim, “the human approach is not scalable”.
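Professor Lim's scalability point is the crux of the “man and machine” collaboration: a classifier can be trained on a small, human-coded seed set and then applied to the much larger remainder of the corpus. As a minimal sketch only – the seed posts and labels below are invented for illustration, not drawn from the study – a multinomial Naive Bayes classifier in plain Python:

```python
from collections import Counter, defaultdict
import math

def train_nb(labelled):
    """Train a multinomial Naive Bayes model from (text, label) pairs."""
    word_counts = defaultdict(Counter)   # label -> word frequencies
    label_counts = Counter()             # label -> number of documents
    for text, label in labelled:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)  # prior
        n_words = sum(word_counts[label].values())
        for w in text.lower().split():
            score += math.log((word_counts[label][w] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical hand-coded seed set; in practice these labels would come
# from trained human coders applying the study's academic definitions.
seed = [
    ("parliament debated the housing policy bill", "political"),
    ("the minister announced new election rules", "political"),
    ("my recipe for chicken rice was a hit", "non-political"),
    ("holiday photos from our trip to Bali", "non-political"),
]
wc, lc = train_nb(seed)
print(classify("the opposition questioned the policy in parliament", wc, lc))
# → political
```

The human coders' accuracy is preserved where it matters – in building and validating the seed set – while the machine handles the volume the humans cannot.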
I could not get my question across cogently at the session, but I had three afterthoughts:
1. Writing versus reading, presence versus impact. Distinctions must be made, since the presence of a blog or blog post per se does not indicate its impact. Different blog posts attract different readers, with dissimilar traffic. If political sentiments are to be gauged from political content, then consideration has to be given to readership and virality.
2. The relevance of the medium. This is related to the previous point, because if reading habits are included, the volume of dynamic data across the Internet and multiple social media platforms will be impossible for human coders to manage.
It was explained that blogs are more accessible, and that obtaining data from Facebook is plagued by privacy problems. The trouble is that blogs may not be as popular as they used to be, and therefore may not be representative in the study of impact and sentiments. Notwithstanding the technical challenges, Facebook and Twitter could provide useful information for such studies, though the analysis is further complicated by the fact that sharing an article or link on social media does not necessarily imply endorsement.
3. The agenda-setting impact of online content, and the incorporation of “machines”. After Mr. Tan spoke of a potential triangulation project – for application to the upcoming elections or for forecasting purposes (especially in the absence of polling data) – Professor Lim explained the processes of text and topic classification.
If the text analytics technique could be used to complement – not replace – human labelling efforts, how could the impact of political content be evaluated? Besides traditional hyperlink analysis, what about mentions of online content in the mainstream media, or by parliamentarians and ministers? Could longitudinal studies be employed to determine whether these online engagements do increase the level of discourse?
Could topic-specific analysis be conducted by “man” after some sampling by “machines”? For a particular subject or controversy – on the Central Provident Fund, homosexuality or gender identity for example – can the prevailing sentiment on the online space be weighed?
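One hedged sketch of such a “machine first, man second” pipeline: the machine filters posts by a topic keyword and scores them against a sentiment lexicon, then human coders validate a sample of the scored output. The lexicon and posts below are invented for illustration; a real study would use a validated lexicon and the researchers' own topic definitions:

```python
# Hypothetical sentiment lexicon; a real study would use a validated one.
POSITIVE = {"support", "fair", "good", "welcome"}
NEGATIVE = {"unfair", "bad", "oppose", "worried"}

# Invented example posts on a topic (here, the Central Provident Fund).
posts = [
    "I welcome the CPF changes, the new withdrawal rules seem fair",
    "Still worried the CPF minimum sum is unfair to retirees",
    "Great hawker food at the new mall",  # off-topic, should be excluded
]

def topic_sentiment(posts, keyword):
    """Score posts mentioning the keyword: +1 per positive word, -1 per negative."""
    scores = []
    for post in posts:
        words = set(post.lower().split())
        if keyword.lower() not in words:
            continue  # machine-side sampling: keep only on-topic posts
        scores.append(len(words & POSITIVE) - len(words & NEGATIVE))
    return scores

print(topic_sentiment(posts, "CPF"))  # → [2, -2]
```

The machine's role here is cheap triage across thousands of posts; the human coders' role is to check that the keyword filter and lexicon actually capture the controversy being weighed – precisely the complement, not replacement, raised above.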