DataBlog
By Guido Romeo
Three Italian projects make it to the Data Journalism Awards finals (plus: a special behind the scenes)
#doveticuri, Gaza’s Gas and Medici obiettori: three Italian projects are among the 72 finalists vying for the 2013 Data Journalism Awards.
On June 20, at Paris’s Hôtel de Ville, the final event of the 2013 Data Journalism Awards will reveal this year’s winners of what has come to be a sort of Pulitzer Prize for data journalism worldwide. We are proud to announce that three Italian projects are in the running.
Jacopo Ottaviani’s inquiry on pro-life doctors, published in Il Fatto Quotidiano, is up for the “Data-driven investigative journalism - Big media” category, while in the parallel “small media” category you’ll find Cecilia Ferrara’s inquiry on EU funds destined for Gaza, published by the newly founded Irpi (full disclosure: I am on the association’s board, but did not take part in the inquiry), and “Wired Italia”’s Lo sai #doveticuri? (literally, “do you know where you are being cured?”), which includes the first map of Italian hospitals’ mortality rates (full disclosure/2: I am one of the authors).
The jury is currently working on making its evaluations, but popular vote counts too: you can express your preferences here.
The 72 projects represent a wonderful gamut of ideas, techniques and approaches to data journalism; Datablog will certainly cover the most interesting ones soon, but in the meantime I can tell you more about how #doveticuri – which I coordinated – developed. While the inquiry was not funded directly by the <ahref Foundation, the skills and knowledge acquired while carrying out the Foundation’s iData project were crucial, and we hope they will contribute to amplifying the civic effect of our efforts even more.
Here is a brief breakdown of how we worked.
1. Data. All the data we used were obtained from Agenas, the Agency of the Italian Ministry of Health for regional health services, which for years has gathered information to evaluate public and private facilities in Italy. These are some of the best data available in the field, and therefore did not require much “cleaning up” work. It should be noted that they are not technically open data, and that we were able to use them only after being granted access to a password-protected website (made available only to a few selected journalists and doctors). As “Wired Italia” was not among the media already selected by the press office of the Ministry, we had to make a formal request for access. We collected the data by “scraping” them from the site with the help of Bologna’s HacksHackers (a group founded during the 2012 International Journalism Festival, as a spin-off of the <ahref Foundation’s iData project).
The database we obtained is an Excel spreadsheet of 1,200 rows (one per hospital) with 19 indicators of causes of death. We are currently working on a way to make it completely open.
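For readers curious about what “scraping” into a spreadsheet involves, here is a minimal sketch using only Python's standard library. It is not the code we used: the table layout, column names and figures in SAMPLE are invented for illustration, and a real page would first have to be downloaded after logging in.

```python
from html.parser import HTMLParser
import csv
import io


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text between cells (whitespace, stray markup) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Return the rows of an HTML table as CSV text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical page fragment: names and numbers are made up.
SAMPLE = """
<table>
  <tr><th>hospital</th><th>indicator</th><th>rate</th></tr>
  <tr><td>Ospedale A</td><td>heart attack, 30-day mortality</td><td>9.8</td></tr>
  <tr><td>Ospedale B</td><td>heart attack, 30-day mortality</td><td>12.1</td></tr>
</table>
"""

if __name__ == "__main__":
    print(table_to_csv(SAMPLE))
```

The resulting CSV text opens directly in Excel or any other spreadsheet program, one row per table row.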
2. Timing. We collected our data at the end of 2012, but we really started to work on them only in December, in order to publish the inquiry in print in March (and online in early April, with a slight delay due to some mapping issues). If we had had the chance to work on the project full time, two weeks would have sufficed to complete it.
3. Resources. #doveticuri was the result of the work of three journalists at “Wired Italia” (Marco Boscolo, contributor; Denis Rizzoli, an intern at the time; and myself, editor), one information designer (Massimiliano Mauro) and one computer scientist, Fabio Dellutri from Mitecube, who had previously worked on the ScuoleSicure project.
4. Costs. Expenses (excluding “Wired Italia” journalists’ and staff’s wages and fees) were around 1,000 euros, mainly for mapping.
5. The map. Hospital geolocation was carried out in the editorial department with QGIS software, after combining the names of the structures surveyed by Agenas (whose database did not include addresses) with addresses and data about size, provided by the Ministry and by Istat. The geolocation’s quality was generally quite good, but 10% of the structures required a revision. The editorial team designed the graphic aspects, but the final display solutions emerged from consultations with Mitecube. The outcome is a map that can also be viewed on mobile devices.
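Combining two lists of hospitals that spell the same names differently is the fiddly part of a step like this. The sketch below shows one possible approach with Python's standard difflib module; it is an illustration, not our actual workflow. The function names, the 0.8 similarity cutoff and the sample names and addresses are all assumptions made up for the example.

```python
import difflib


def normalise(name):
    """Lower-case a name, drop full stops and collapse repeated spaces."""
    return " ".join(name.lower().replace(".", " ").split())


def match_hospitals(surveyed, registry, cutoff=0.8):
    """Pair each surveyed name with the closest registry entry.

    surveyed: list of hospital names without addresses
    registry: dict mapping hospital name -> address
    Returns (matched, review): a dict name -> address, and the list of
    names with no sufficiently similar entry, to be checked by hand.
    """
    lookup = {normalise(name): address for name, address in registry.items()}
    matched, review = {}, []
    for name in surveyed:
        hits = difflib.get_close_matches(
            normalise(name), list(lookup), n=1, cutoff=cutoff
        )
        if hits:
            matched[name] = lookup[hits[0]]
        else:
            review.append(name)
    return matched, review


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    registry = {
        "Ospedale San Giovanni": "Via Roma 1, Torino",
        "Policlinico Sant'Anna": "Corso Italia 5, Bari",
    }
    surveyed = ["Osp. San Giovanni", "Policlinico S. Anna", "Clinica Nuova"]
    matched, review = match_hospitals(surveyed, registry)
    print(matched)
    print("To review by hand:", review)
```

A higher cutoff produces fewer wrong pairings but leaves more names for manual revision; whatever the setting, automatic matches are worth spot-checking before they go on a map.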
6. Feedback. Feedback was predominantly positive. Some criticism came not so much from medical professionals as from people working on the communication of health issues, who were worried that readers might get a partial or distorted idea of healthcare in Italy, or might not understand the meaning of the data they view.
Before publishing we discussed these issues among ourselves and with specialists in the field, and we concluded that these data were of such interest for the public that it would have been unethical for us as journalists not to make them available. We published them trying to explain their meaning as thoroughly as possible.
Will these data stir up some trouble? Frankly, we hope so. Just think of the great differences in mortality rates between hospitals in the country. Healthcare is always a crucial issue during political campaigns, and we believe that citizens living in various parts of Italy should demand better service in their Region.
7. Next steps. As expected, our database yields myriad further stories that we are beginning to develop. In addition, we published data from 2012 but have those from 2007 to 2011 up our sleeve, and will begin to experiment with them soon – so you’ll hear more about the project.
If anyone has any ideas for “civic” projects (that don’t pose any risks in terms of editorial conflict), we’d love to talk.
8. Improvements. We are definitely working on system usability, and on ways to foster an open discussion about the issues addressed by the project: we want to do better on these two aspects, so of course any suggestions are welcome.
Accuracy, data, access and participation: the spin-offs of civic media
Civic media are a hot spot of innovation with big and small upshots, and have generated a wide range of spin-offs in the last few years.
What are civic media? According to Henry Jenkins, of the MIT Media Lab, civic media are one of the collision points between means of communication and civil society’s awakening. They are a hybrid that exploded with the success of new media and their social dimension, with the potential to often cross over to mainstream media as well. But first and foremost, civic media are one of the moment’s most interesting and prolific areas for research and innovation. The <ahref Foundation was probably the first in Italy to embark on a specific mission about civic media and to foster the growth of an environment of projects, reflections and initiatives that don’t always share the same financial basis, yet are deeply connected by a common philosophy. We could borrow the concept of “punctuated equilibrium” from paleontologist Stephen Jay Gould’s theory, and say this is an ecosystem in which various players share some fundamental genes, although each one expresses them in wildly different ways.
This is the case of some recently launched projects, which evolved from reflections made on the iData project, supported by the <ahref Foundation in 2010. A few days ago the first public event of Diritto Di Sapere (“right to know”) took place, supported by the Open Society Foundations with the goal of expanding access to information in Italy. The right to access information is the basis of laws such as the Freedom of Information Act in the United States, which allows American journalists to conduct their research so effectively that it often contributes to Pulitzer Prize winning reports. In Italy this right still needs bolstering, despite the impressive current developments sparked by the anti-corruption decrees and transparency principles enforced by art. 18 of the so-called “Development Decree 2.0” promoted by minister Corrado Passera. Diritto Di Sapere has recently published LegalLeaks, the first how-to guide about information access for citizens and journalists (download here).
Access – in terms of fundamental human rights, not technical problems – is also a central issue on the Digital Agenda for Europe and in the international Open Government Partnership debate. Indeed, it is instrumental not only for today’s “trendy” open data developments (sound access rights would be the best guarantee of useful, high-quality open data) but also for the kind of civil participation that is essential in modern democracies.
The fractions of DNA that iData and Diritto Di Sapere share predate these projects, and can be traced back to Philip Meyer’s “Precision Journalism”. Indeed, technical skills and tools to access information (data, as well as documents) need to move forward in sync to really make progress towards the goal of improving journalism and democracy. And new spin-offs of the civic-media philosophy are still to come. As Aron Pilhofer stated during the 2012 International Journalism Festival, the beauty (and the curse, I may add...) of Italy is the fact that there is so much room for media innovation. For instance, fact checking – an eyebrow-raising concept only a few years ago – has now become a popular issue online, and has even earned a spot in many national newspapers.
Widening the perspective, and thanks to the support of Perugia’s Journalism Festival – which year after year has established itself as a phenomenal incubator of experiences – a series of seminars called HacksHackers was recently held in Bologna and Milan (but could be replicated anywhere: it’s an open source format!) to encourage new collaborations between technology and journalism. Again, you don’t need to be Craig Venter to pinpoint markers from MIT and iData. Further along the road of evolution of this “species”, yet still in the same cloud of punctuated equilibrium, last week the Irpi was presented: a new center for investigative journalism, conceived by Guia Baggi and Leo Sisti, which is home to a part of data journalism that has already caught the attention of the foreign press.
Last but not least, Datajournalism.it will be launched during the 2013 International Journalism Festival, in a few months. The website aims to connect the people working in Italian data journalism – who might not be numerous at the moment, but are growing – also thanks to the datajournalismitaly community, which currently has over 250 members.
Data Journalism Awards: winning stories in investigative journalism
We are pleased to announce the winning stories for the recent Data Journalism Awards, the first international contest recognising outstanding work in the field of data journalism worldwide.
These stories cover issues such as anti-terrorism and public spending, health and education systems, always with a strong investigative push and great accuracy in data use. Among others, an inquiry by the Seattle Times on overdoses of the painkiller methadone and people’s income had already been awarded the Pulitzer Prize.
Three of the 16 stories selected for the data-driven investigative journalism category (the other two categories were data visualisation & storytelling, and data-driven applications) are the result of partnerships between media outlets and non-profit journalism programs, such as the Berkeley Investigative Reporting Program, the Center for Investigative Reporting and The Center for Public Integrity.
Published last fall by US monthly Mother Jones, Terrorists for the FBI explains how the FBI has built a massive network of spies to prevent another domestic attack. But the actual targets became American citizens “at risk”, and the accompanying data set shows the details of the prosecutions of 508 alleged “domestic terrorists”. Readers can even “play” with the data to check the outcomes themselves.
Methadone and the Politics of Pain is another first-class investigation into the many distortions of the current US health system. Medicaid patients are encouraged to use a narcotic that costs less than a dollar a dose, with officials insisting that methadone is safe. But hundreds die from it each year, and more than anyone else it’s the poor who pay the price.
Both inquiries integrated several formats and tools, both online and on print editions, producing a variety of interactive infographics that directly engage readers on very controversial and often “forgotten” topics. They are perfect examples of a data journalism that supports truly important stories in the public interest.
More on the Data Journalism School by Ahref and Istat (#djs12)
With so much data and so many tools freely available today, there is no excuse for holding back on experiments and data journalism projects. Building on this suggestion by Simon Rogers on The Guardian’s DataBlog, here is a summary of the second day’s events at the Data Journalism School promoted by Ahref and Istat in Rome (#djs12).
Stefano De Francisci dealt with the principles and statistics behind visualizations, moving from Edward Tufte to Stephen Few (author of “Show Me the Numbers”) and Hans Rosling, founder of Gapminder and developer of Statistic eXplorer, adopted by the OECD and Istat (and also the engine behind Google Fusion). Federico Geremei and Fabio Lipizzi focused on the critical use of sources and repositories, while Tomaso Pisapia addressed the crucial issue of data access.
Paolo Ciuccarelli (Density Design Lab at Milan’s Politecnico) said that a graphic presentation of data should not be reduced to a visualization, adding that developing a visual story about complexity can never be neutral (despite Tufte). To support his point, Ciuccarelli illustrated Napoleon’s March to Moscow (the Russian campaign of 1812) by Charles Minard, the Data Visualization Serendipity by Joe Boeckenstedt, and Newsmap, a visualization of Google News.
Several tools introduced at the event seemed very easy to use right away: Many Eyes, Tableau Public and Google Fusion, already included in a toolkit by Elisabetta Tola; Visual.ly, to visualize social media data; an intuitive Infogram, and Fineo, a great tool for flow charts launched one year ago by Density Design.
Finally, Ciuccarelli threw out a provocative idea: is data visualization another bubble ready to burst? Most probably that’s the case right now, but after that we will surely see a wave of innovation in best practices and tools.
Good perspectives for Data Journalism in Italy
Data Journalism is gaining more and more attention in Italy, but how do you turn CSV files and spreadsheets into a journalistic story? To answer questions like this, Ahref and Istat promoted the Data Journalism School – with support from Enel, the first major public corporation to make its own datasets available.
The event gathered 23 attendees in Rome and started with a presentation by Elisabetta Tola (@elisabetta_tola) and Guido Romeo (@guidoromeo): an overview of best practices worldwide and in Italy.
Ettore di Cesare and Vittorio Alvino, Openpolis "civic hackers", detailed the basic mechanisms and difficulties involved in building OpenParlamento and the upcoming OpenMunicipio. They also discussed the potential of open data in the near future and its (not so easy) relationship with traditional media.
Istat’s Vincenzo Patruno explained that journalism should embrace and re-use open data, taking advantage of ad hoc platforms such as Socrata, DataMarket and BuzzData, along with Scraper, an easy-to-use plug-in for Chrome. Francesca Fuxa Sadurny talked about specific laws and procedures (particularly the 196/2003 Act) that relate to such issues in Italy.
Anna Maria Tononi, communication manager at Istat, offered an in-depth analysis of the need for closer collaboration between the open data world and media outlets. In fact, TV is still the main source of information for most citizens, particularly in Italy (followed by the press and, far behind, the Internet). How can we improve the overall distribution of data and their re-use by ordinary citizens? «By promoting storytelling», said Tononi: all of us should learn how to tell good stories based on actual data, starting of course with online journalists.
Data Journalism Awards in Paris on 30 May 2012
The Data Journalism Awards, a sort of Pulitzer Prize for data journalism, will take place in Paris during the News World Summit 2012 at the opening gala dinner on 30 May 2012. The jury is chaired by Paul Steiger, managing editor of top investigative journalism newsroom ProPublica (www.propublica.org).
Last month, the 58 semifinalists were introduced at the International Journalism Festival in Perugia, including two Italian projects: Toxic Europe, about toxic waste trafficking in Europe, and Peoplemovin, on migration flows. A few days ago the finalists were also announced, including Peoplemovin by Carlo Zapponi. These three nominations are a great success for the Italian journalism community, particularly given their independent nature in a context where most other projects are proposed by large news organizations.
In any case, as Nelson Mauro points out in his Digital First, some of them are quite interesting ideas and provide good insights also for Ahref’s iData project.
For instance, Riot Rumors: how misinformation spread on Twitter during a time of crisis (The Guardian, http://www.guardian.co.uk/uk/interactive/2011/dec/07/london-riots-twitter) helps to quickly find and fix the inaccurate news that spreads so fast during current events, such as the London riots last August.
Country equivalents – interactive comparisons (The Economist) provides an interactive map of Gross Domestic Product (GDP) data across the world – a useful tool in the on-going discussion about the GDP role for actual development and prosperity in any given country.
Every death on every road in Great Britain 1999-2010 (BBC) offers an interactive map covering 36,000 accidents that occurred in the last ten years in the UK, with descriptions, icons, and other clickable options – something that would be very useful in Italy too.
Your School (The Australian) lists over 10,000 national schools with their performance, equipment and other specific data. Last but not least, Phone-hacking scandal: Who’s linked to who? (BBC News) is an interactive map of the people, events, and timeline of the scandal that has engulfed Rupert Murdoch and News of the World. A similar project would be crucial to expose the economic and political intrigues that have long characterized the Italian scene.
EU funding for middle schools in Southern Italy
Four regions in Southern Italy (Calabria, Campania, Sicilia, Puglia) received EU funding as part of the Programma Operativo Nazionale (PON) to improve teaching and learning processes in middle and high schools -- for a total of over 5 million Euro. This map provides more information such as funding amount for each province and locally financed projects.
Data Journalism Awards
The Data Journalism Awards is the first international contest recognising outstanding work in the field of data journalism worldwide. The six cash prizes, for a total of 45,000 Euro, will be awarded at the News World Summit (30 May - 1 June 2012, Paris), and submissions are open until 10 April 2012. In a project that could easily be dubbed the “Pulitzer prize of data journalism,” the competition covers three categories -- Data-driven investigative journalism; Data visualisation & storytelling; Data-driven applications (mobile or web) – and includes stories published or broadcast between 11 April 2011 and 10 April 2012.
The president of the jury is Paul Steiger, Editor-in-Chief of ProPublica, the non-profit investigative newsroom based in New York, and member of the <ahref Foundation’s scientific committee. Other jurors are top executives of prominent publishing ventures, such as Thomson Reuters, The New York Times, and Les Echos. Launched by the Global Editors Network in collaboration with the European Journalism Center and with support from Google, the contest also includes the Ahref Foundation among its media partners.
In particular, the competition seeks to contribute to setting high standards and highlighting the best practices in data journalism and to inspire journalists by showcasing outstanding work. Also important is to attract the attention of publishers and investors interested in promoting ventures focused on a full integration of journalism and technology skills. In this context, some innovative projects are ProPublica’s inquiry on school disparity (the “opportunity gap”) and the investigative report on property insurance – both concerning the US State of Florida – carried out by Paige Saint John of the Sarasota Herald Tribune.
For this latter report, Paige Saint John was awarded the 2011 Investigative Reporting Pulitzer Prize. She will be speaking at the upcoming International Journalism Festival in Perugia, Italy (25-29 April 2012).
Data journalism: a mailing list for Italian users
We are pleased to announce a useful resource for Italy’s data journalism community. The mailing list <datajournalismitaly@googlegroups.com> already has more than a dozen subscribers and a great discussion plan. To be clear, this initiative is mostly due to the persistent support of Maurizio Napolitano, researcher at Fbk in Trento and Italian ambassador for the Open Knowledge Foundation, and to the extremely positive experience of the Spaghetti Open Data list.
Here is a basic list of topics we plan to cover in the near future:
- best and most inspiring practices in data journalism in Italy and abroad;
- reviews of tools for scraping, analysis and visualization techniques;
- sources and ways of obtaining data when not readily available;
- info on conferences and relevant events worldwide.
Above all, we do hope this initiative will stimulate new contributions and collaborative data-based narratives.
For more info and to subscribe: http://groups.google.com/group/datajournalismitaly
Guido Romeo
The image "My C: Drive", used for the home page, is by bsimser and released under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Generic (CC BY-NC-SA 2.0) License.
New ebooks on data journalism
Will data save journalism? Of course not, but to create innovative information in the public interest we cannot underestimate these tools that are producing valuable results for both media outlets and citizens in the English-speaking world.
In our continuing effort to support this trend, the <ahref Foundation is gathering a sort of "data toolkit", currently in its beta version and aimed at reporters (in a broader sense) interested in applying these techniques.
Given the few resources available in Italian, we are pleased to announce the release of an ebook titled "Open Data e Data Journalism" by Lsdi, an organization devoted to freedom of information. Produced by Andrea Fama, this release takes into account both open data and data journalism – thus integrating two different but convergent strategies that are taking their baby steps in Italy. This project also reveals a great need to increase the level of training and awareness both in the journalism community and in society at large. (Full disclosure: the ebook also includes positive reviews of the iData research project promoted by the <ahref Foundation.)
Finally, the non-profit Open Knowledge Foundation and the European Journalism Center are working on a comprehensive handbook on data journalism. Aimed at explaining “how you can approach data journalism from scratch with no prior knowledge,” this project is a direct result of a series of collaborative sessions held at the recent 2011 Mozilla Festival in London.
Stay tuned for more details about these and other exciting projects!


