Congress has embarked on its new session, and intense and important policy debates have begun in earnest. As a new year begins at the EAC, another debate that has confronted data-lovers for decades continues in earnest as well. While not quite as important as the debates on Capitol Hill, it is possibly just as passionate. The question remains: do you say "data is" or "data are"?
At the EAC, we talk a lot about data, especially this month, as we look at the past, present and future of our Election Administration and Voting Survey (EAVS). When describing data, some here use "data are" (which has been my preference), while others use "data is."
We are hardly the first group of people to have this debate. Those in the "data are" camp point out that "datum" is singular and "data" is plural, hence "data are." Or, as one of the smart people at FiveThirtyEight noted when this discussion came up before the launch of their website, "Grammatically… it has to be ‘data are.’ You can’t argue with the grammar."
Monica Crane Childers, Director of Government Services at Democracy Works, prefers "data are" as well. "There’s no ‘right’ answer, of course, since we’ve come to accept both forms in American English. I tend to use ‘data are’ because you’re usually referencing a data set, rather than a single piece of information. But I like that the phrase also underscores the multiple meanings or interpretations of any data set. The fact that data must be interpreted means that there’s never just one right conclusion, so the data are many things by their very nature."
However, those favoring "data is" counter that hardly anyone uses the word "datum" or says "datum point," for example. Along those lines, the word "agenda" is technically the plural of "agendum," but today "agenda" is accepted as a singular. "‘Data are’ feels anachronistic and awkward to me, even if it’s technically correct," said Doug Chapin, director of the Program for Excellence in Election Administration at the University of Minnesota and a proponent of "data is."
But Chapin thinks there is a more important reason to use "data is." "The use of ‘are’ suggests that each individual datum has separate value standing alone, when the truth is that the power of data is in its collective effect; namely, the opportunity to see a bigger picture from the aggregation of individual points. While it’s true that a single datum can be interesting – just like a grain of sand or a single snowflake – we tend to be interested in the collective impact rather than complaining that sand on the beach ‘are’ hot or snow ‘are’ covering our driveways. To me, that calls for a collective noun; hence, data *is* beautiful."
Other "data is" supporters also note that "data are" doesn’t sound like how people actually speak. In fact, some suggest "data are" sounds, well, a bit off-putting, obnoxious and pretentious. To that I say: I’ve been called worse, but I am in fact considering getting off my "data are" high horse.