Implications of Big Data Collection for Journalists
Chicago 3D River created by Holly Storer
Data collection, especially mass data collection, has huge implications for journalists, both positive and negative. Technological advancement has led to the mass accumulation of data, which can serve as a great resource for journalists seeking relevant, factual statistical information.
On the other side of the spectrum, sifting through data or finding information on specific geographic, societal and demographic categories can prove an overwhelming challenge. Furthermore, there is no standard way to collect data, which can lead to false accounts and therefore incorrect statistical information. Generally speaking, though, our ability to observe and analyze data is one that the forefathers of writing, research and documentation would envy.
Today, information relating to almost everything we participate in is collected and archived. For a journalist, access to this kind of information is invaluable. For instance, a journalist in Illinois looking for the locations of individuals convicted of sex crimes only has to look as far as the Illinois Sex Offender Registry on the Illinois State Police website, which puts statistical and geographical information for any address and area in Illinois at their fingertips.
This, however, is just a small piece of the abundance of data available.
At the Illinois Architectural Foundation, examples of data collected range from potholes to disease demographics to rat infestations, to name just a few. Being able to draw from so many data resources lets journalists connect topics, such as the relationship between poverty and disease. For almost every journalist, this mass availability of information is essential for adding insight to a story.
The other side of mass data collection is that gathering specific, narrowly focused data can be difficult.
This essay began as an idea about gathering statistics on red light and speed monitoring cameras for journalists. While the original data was rather accessible through the Illinois Department of Transportation website, the finer details of the impact that red light and automated speed enforcement tickets have on individuals seemed to have been lost amid the larger population-level data. This is an instance where too large a quantity of data can itself be the problem.
Data collection generally aims to cover broad subject matter, leaving us to navigate the more niche subjects on our own. While this may be a problem now, data will likely continue to be broken down from extremely large pools into smaller, more categorical patterns.
Data can often be corrupted by both human and mechanical error. Think back to high school science experiments: in the process of collecting information, there always has to be a control group. In society there are no standardized control groups, unless we have a hidden utopia of perfect humans among us, which we clearly do not. Therefore, there will always be error in data, and these errors can cause huge problems in reporting.
An example of a data collection error is the Bush v. Gore election of 2000, where race statistics were massively incorrect on Florida's “Felon List.” According to Warren D. Smith (PhD) of Range Voting, this error negated the ballots of certain African American individuals who were incorrectly listed as felons and were therefore treated as legally unable to cast a vote. The end result of this error was the election of George W. Bush as president, while Al Gore was left grasping at metaphorical straws. And because problems typically roll downhill, the data errors also affected the integrity of media reporting.
While data collection can be both positive and negative, the basic principle of “Big Data” is transparency. This transparency leads to clearer, more concise reporting. Yes, there will be errors and misinformation, but ultimately the big picture is being pieced together by none other than writers, journalists and other media personnel.