In the third week of this course, we looked at large-scale mapping of Twitter data. The researchers running the course had worked on a project that mapped the content of every Twitter account across Australia, around 870 million tweets.
This is a departure from the previous studies I have looked at, which focus on specific hashtags. While hashtag data sets are often very large and wide-ranging in content, they don’t necessarily provide a good picture of Twitter usage as a whole – there’s a better explanation of this on page two of this document: http://eprints.qut.edu.au/82986/1/A%20Big%20Data%20Approach%20to%20Mapping%20the%20Australian%20Twittersphere.pdf
The method used was similar to the one in my previous Social Media course, in which nodes in a network were positioned according to their edges, but in a more complex fashion than before, owing to the size of the data, as in this quote from the course material:
For example, two nodes that share an edge may be close together, and closer still if that edge has a high weight. But as the algorithm works through all of the nodes and edges in the network, conflicts emerge: a group of nodes may all be strongly connected and so are bunched close together, but where should we place a new node that has just one strong connection to one node within that group?
This affects the type of visualisation that you need to use.
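The placement behaviour described in that quote is what a force-directed layout does: edges act like springs, and higher edge weights pull nodes closer together. Below is a minimal sketch using networkx's `spring_layout` on a hypothetical toy graph; this is my own illustration, not the course's actual tooling, and a real data set of 870 million tweets would need far more scalable software.

```python
import networkx as nx

# Hypothetical toy network: three strongly tied accounts plus one
# weakly attached account, standing in for the conflict described
# in the quote (where should the loosely connected node go?).
G = nx.Graph()
G.add_edge("A", "B", weight=5.0)  # strong ties: layout pulls these together
G.add_edge("B", "C", weight=5.0)
G.add_edge("A", "C", weight=5.0)
G.add_edge("C", "D", weight=0.5)  # weak tie: D ends up further out

# Force-directed layout: edges behave as springs, and edge weight
# scales the attraction between the endpoints.
pos = nx.spring_layout(G, weight="weight", seed=42)
for node, (x, y) in pos.items():
    print(f"{node}: ({x:.2f}, {y:.2f})")
```

With a fixed `seed` the layout is reproducible; the tightly connected triangle A–B–C clusters together while D, with its single weak edge, sits further away.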
As with my previous course, this course focused on how nodes in a network take on different importance when regarded in different ways. For example, the most important node might be the one with the most direct connections, or it might be the one that lies on the most shortest paths between other nodes – a node that connects several otherwise unrelated clusters.
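The two notions of importance above correspond to degree centrality and betweenness centrality. The sketch below, which is my own hypothetical example rather than anything from the course, builds two tight clusters joined by a single bridge node: a cluster member wins on direct connections, while the bridge wins on shortest paths.

```python
from itertools import combinations

import networkx as nx

# Hypothetical network: two fully connected clusters joined only
# through the bridge node "x".
G = nx.Graph()
G.add_edges_from(combinations(["a", "b", "c", "d"], 2))  # cluster 1
G.add_edges_from(combinations(["e", "f", "g", "h"], 2))  # cluster 2
G.add_edges_from([("a", "x"), ("x", "e")])               # the bridge

degree = nx.degree_centrality(G)        # most direct connections
between = nx.betweenness_centrality(G)  # lies on the most shortest paths

print(max(degree, key=degree.get))   # a hub inside a cluster ("a")
print(max(between, key=between.get)) # the bridge node ("x")
```

Node "x" has only two connections, so it scores poorly on degree centrality, yet every shortest path between the clusters runs through it, so it scores highest on betweenness – exactly the kind of node that connects otherwise unrelated clusters.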