By Guido Romeo
"Data is the future of journalism," Sir Tim Berners-Lee stated a few months ago. Data journalism owes a great deal to the inventor of the web. Data has been archived digitally for more than fifty years, but the development of the web has created unprecedented possibilities, and many editorial offices are already experimenting with new ways of turning data into information.
Geoff McGhee, who has worked at the NYTimes, ABC News and Le Monde interactif, used a John S. Knight Fellowship to bring together in a video the stories behind many of the most significant episodes of data journalism. The video (an extract is below; the full version is on the Stanford University site) is very accessible and has become a “classic”.
Among Geoff’s conclusions, the following points are worth highlighting, and this blog will analyse all of them in greater depth:
• The analysis of data requires new instruments, above all in editorial offices which are by no means fitted out like statistical bureaus.
• Beyond the analysis of data, a very important issue is its visualization, where statistics, journalism and artistic sensibility merge into one.
• Journalists are developing a new ability to write articles using data.
• There is no shortage of traps: many visualizations of data look great but do they really tell a story?
• Real-time data: with the widespread use of mobile phones and the web, the amount of real-time data is increasing dramatically, which opens up a very intriguing scenario.
Just think what could be achieved with anonymous data from a million inhabitants of a city, for example to reconstruct the flow of traffic during the rush hour, or with data on electoral polling (the NYTimes did this for the election of Obama in 2008 and repeated it last year for the congressional elections).
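To make the traffic idea a little more concrete, here is a minimal sketch of how anonymous location "pings" could be grouped by hour to spot the rush hour. Everything in it is an assumption made for illustration: the sample records are invented, and the names PINGS, pings_per_hour and busiest_hour are hypothetical and do not belong to any existing tool.

```python
from collections import Counter
from datetime import datetime

# Invented sample data: (ISO timestamp, road segment) for anonymous pings.
# A real dataset would hold millions of records and need careful anonymisation.
PINGS = [
    ("2011-03-01T08:05:00", "ring-road"),
    ("2011-03-01T08:20:00", "ring-road"),
    ("2011-03-01T08:45:00", "centre"),
    ("2011-03-01T13:10:00", "centre"),
    ("2011-03-01T18:30:00", "ring-road"),
    ("2011-03-01T18:40:00", "ring-road"),
]


def pings_per_hour(pings):
    """Count how many pings fall in each hour of the day (0-23)."""
    counts = Counter()
    for timestamp, _segment in pings:
        counts[datetime.fromisoformat(timestamp).hour] += 1
    return counts


def busiest_hour(pings):
    """Return the hour of the day with the most pings."""
    counts = pings_per_hour(pings)
    return max(counts, key=counts.get)


if __name__ == "__main__":
    for hour, count in sorted(pings_per_hour(PINGS).items()):
        print(f"{hour:02d}:00  {'#' * count}")
    print("Busiest hour:", busiest_hour(PINGS))
```

Counting by hour is only the first step; the journalistic work lies in checking where the data comes from and what it leaves out.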
This blog is the first public appearance of iData, the project supported by the Ahref Foundation, which aims to develop the first Italian platform for data-driven journalism.
“Data-journalism” is a fast-expanding phenomenon, especially in the United States and Great Britain. Among the best examples are the work of Paul Bradshaw and that of ProPublica, whose Recovery Tracker has organized the data on the anti-crisis measures adopted by the U.S. government to stimulate the economy into a freely accessible database. Other newspapers have done similar work and produced more detailed articles, making data-journalism a formidable instrument of transparency and democracy.
Is it journalism for geeks? No. It is journalism that asks you to respect all the old trappings (hypothesis, research and checking, and of course elbow grease...), but which uses software, often tailor-made, to process the mass of data made available through digitalisation, data which is often meaningless unless it is handled with sufficiently powerful tools. A recent example is the excellent investigation “Dollars for Docs” by ProPublica, which has identified and located the workplaces of more than seven thousand doctors in the U.S. who have accepted compensation from drug companies.
In this light, data-journalism is not an arena reserved for investigative journalists and “code magicians” but an open playing field for individuals, associations and stakeholders in the widest sense.
In Italy, unfortunately, there are no public data repositories as rich as data.gov in the U.S. and data.gov.uk in the UK, and the legislation that ensures access to public data, such as Law 241 of 1990 and its subsequent revisions, is still very far from what is provided in London and in Washington.
In Italy, however, there has been a proliferation of open-data experiments by institutions, such as Dati.Piemonte and Open Data launched by the city of Udine, as well as initiatives such as the one that monitors spending by the National Parliament, launched by the Italian Radical Party, Agora Digitale and Valigia Blu.
The digitalisation of data is still too often perceived as the Achilles heel of traditional media. Yet the use and creation of open source data is emerging as a great opportunity for both information and democracy: in addition to creating new working tools for journalists, it is stimulating a new, dynamic collaboration with and among readers, increasing both the involvement of civil society and the transparency of sources.
But more pragmatically ... what is the purpose of iData?
We aim to develop the first Italian open source platform for data-driven journalism.
Released under a Creative Commons license, the platform will be connected to a range of communities that can collaborate in the collection, production and processing of data.
The data may come from public databases, from public sources or be prepared ad-hoc by communities.
And finally, we aim to organize some of the basic tools that already exist and to develop new ones that help those who, like myself, are not geeks to give meaning to the data!