Over the next few weeks of CDE, we have to produce a weekly handmade data visualisation along with a short piece of writing that explains what it is depicting (or, perhaps, more accurately, what it is aiming to depict). Then we need to say how it might be interpreted usefully as a reflection on the course themes (either ‘learning’, ‘teaching’, or ‘governing’). In particular, we need to give the reader — I guess that’s you — a sense of the decision-making processes that we went through when putting it together.
Anyway, Jeremy and Ben were wise enough not to throw us in at the deep end, so last week we produced a trial data visualisation. You can see mine just below. (I posted it on the forum in Week 2 but I’m only getting a chance to finish up what I’ve written about it now because semester just started where I work and it has been hectic!) The task was to gather data on some aspect of our lives, to put together a handmade data visualisation that represents it (or at least aims to represent it), and then to say something about the decision-making processes in a way that weaves in some themes from the readings we’ve done so far.
First step: Choose something to track. That, for me, was the hardest part: there were so many possibilities to choose from! I was guided by the thought that there were certain things I didn’t want to track and then discuss in public, so I ruled those out. But even when I narrowed down the remaining space of possibility to my fruit consumption, there were still questions about how to individuate what I was tracking — how finely should I individuate apples, for example? Should I lump Granny Smiths, Pink Ladies, and Cox’s Pippins together under the general category ‘apple’? Or should I distinguish them, giving each its own categorisation and attendant category in the visualisation? In the end, I chose the path of simplicity and ran with the general category ‘apple’. Simplifying made things more convenient. For me, at least. Here’s the key with the categorisations. I say more about it below.
Second step: I tracked day-by-day from Monday until Friday. But I found noting down the exact time I had an apple or clementine slightly distracting from the other things I was doing, so I decided to pare down the way I was going to measure and record the time at which I ate any given portion of fruit. You can see on the key above that I don’t record the exact time I’m crunching on a given apple, for example. That decision was not just about convenience either. I decided that you, dear reader, don’t need to know whether I am eating my apple at 8am or 5pm. But I do record whether a token piece of apple-eating happens before or after a token piece of banana-eating, so you are not totally left in the dark. I hope you don’t mind that I choose so carefully what to tell you about my life. You see, I like being in control of data that’s about me. And just as I’m choosing quite carefully what to tell you in this visualisation, I’m choosing even more carefully what not to tell you.
Third step: More questions, more categorisations. As I was doing this, I started to see that the process for generating the data for this task was framed by asking a series of questions (about fruit consumption, in this case). One significant thing about questions is that they are always asked from a point of view: the point of view of a questioner who often has an interest or even a stake in asking this or that. The readings don’t really think of the process of generating data this way (although, interestingly, Lupi and Posavec mention it in passing) but this thought still coheres with Gitelman and Jackson’s claim that data are not raw: data are not “before the fact” (p.2), and so we shouldn’t place trust in their “neutrality and autonomy, their objectivity” (p.3). It coheres less with Williamson’s (2017:29) claim that “data need to be understood as social products”. Do they? If data are generated by questions, and if questions can be asked by a lone (and not especially social) questioner, then data do not need to be understood as social products, even if it is the case that data are, in fact, most of the time, social products. Let’s see how they might be social products.
Another significant thing about questions is that they can be asked by groups of people as well as individuals working by themselves: a group can work together to figure out an interesting question to ask. It will probably take longer for a group to choose a question and the process will probably be subject to negotiation too. Different people in a group might have different interests and this means that they’ll have to negotiate back and forth about what question to settle on. More significantly, some people at the negotiation table might have more power and influence than others when it comes to formulating the questions that generate data. And most significantly, some people might not be at the table at all. It is in this way that data can be (and probably mostly are) social and, indeed, political products, especially when some people are left out of this process.
Key: interpretation and ambiguity
I guess one of the most interesting things I got from this was an understanding of how data are generated. Gitelman and Jackson (2013: 3) write that “[d]ata need to be imagined as data to exist and function as such, and the imagination of data entails an interpretative base”, and these acts of interpretation are crucial to understand. Interpretation itself is a social act involving an audience (or, at least, an imagined audience); and just as you might interpret me more or less charitably when I say something, you might also interpret a visualisation more or less charitably. Interpretation is often going to be a social, and even, at times, a political process. In addition, I’m inclined to think that the acts of categorisation that pepper this whole process are really worth thinking about carefully. I categorise five berries as being a portion. But why are five a portion? Why not four or six? Making this choice felt contingent bordering on arbitrary, and surely whether five or six berries count as a portion is kind of vague and borderliney in a way that might cause difficulties when it comes to interpretation. In the case of berries, this may not matter. But now imagine you are trying to categorise who counts as, for example, being a woman. Recently, whether or not some transgender people count as women has become hotly politicised. Our categorisations may sometimes be anodyne; they are never neutral.
Finally, you might have noticed that some of my categorisations under ‘Verdict’ sound kind of subjective: ‘Tasty’, ‘Meh’ and ‘Yuck’ are going to be hard to measure and interpret for that reason, and I chose these categorisations to make that point. There’s a lot more to say about this than I can say here. It’s also ambiguous whether the audience should read the sections in the lower right-hand corner (‘Verdict’ and ‘Company’) across or down. There’s more to say about that too. But I’ll stop here for now, I think.
I really enjoyed your visualisation – the colours being in tune with the subject was a nice touch.
I liked how you recognised the dilemmas many (?) of us are facing, between simplifying and retaining the complexity of data, and whether to go with more objective or subjective descriptors.
And, of course, ‘Who is the audience?’, ‘What is this all for?’, ‘How could this be interpreted?’