Vendata platform combats misinformation in Venezuela through open data

By Alessandra Monnerat
Oct 30, 2018 in Data Journalism

In Venezuela, a country where restrictions on the press are rampant and access to public information is not guaranteed by law, staying informed about government decisions can be difficult. That is why the Press and Society Institute (IPYS for its acronym in Spanish) Venezuela, together with Transparencia Venezuela, is launching Vendata, an online platform that makes it easy to browse the information contained in the “Gaceta Oficial,” the official government bulletin.

A gaceta, a government publication containing resolutions, decrees, administrative orders and acts, “is the only imperative with which those who govern must comply daily. Everything is in the Gaceta Oficial,” according to a release from the Vendata team.

According to the project’s coordinators, Venezuelan journalists Katherine Pennacchio and Arysbell Arismendi, making that information more accessible would be a great help to Venezuelan journalists.

“We could talk for hours about the challenges that a journalist in Venezuela faces in order to get information,” Arismendi said to the Knight Center for Journalism in the Americas via email. “There is a refusal of officials to give information. Normally a single spokesperson is authorized to give official information which makes the process of obtaining data more bureaucratic. There is no law on access to information. Therefore, the State or public bodies are not obliged to provide data.”

In its 2015 annual report, the Office of the Special Rapporteur for Freedom of Expression for the Inter-American Commission on Human Rights affirmed that the country hasn’t adopted legislation on access to public information “and has not published or handed over information that is undoubtedly of public interest, such as health matters or information regarding public accounts.” The office added that the justice system, on occasion, has also denied guarantees to access information.

The main goal of the Vendata team is to make the project the most important data platform in Venezuela. To do this, the team is creating a search engine and a massive open-source database available to journalists, investigators and the general public, presented in a way that lets them understand and reuse the information, according to Arismendi. The web platform will also have a public API (application programming interface), which will allow the information in Vendata to be read by other programs and crossed with other databases.
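To give a sense of what "crossing" one dataset with another can look like in practice, here is a minimal Python sketch. It is only an illustration: the field names, identifiers and sample records are hypothetical and do not reflect Vendata's actual data model or API.

```python
# Hypothetical records, as they might look after extraction from the Gaceta.
# Field names and values are invented for illustration only.
gaceta_appointments = [
    {"person_id": "V-001", "name": "Persona Uno", "position": "Ambassador"},
    {"person_id": "V-002", "name": "Persona Dos", "position": "Director"},
]

# A second, equally hypothetical dataset to cross against.
company_registry = [
    {"person_id": "V-002", "company": "Empresa Ejemplo C.A."},
    {"person_id": "V-003", "company": "Otra Empresa C.A."},
]


def cross_records(left, right, key):
    """Join two lists of dicts on a shared key (an inner join)."""
    # Index the right-hand dataset by key so each lookup is fast.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    # Keep only left-hand rows that have at least one match on the right.
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


matches = cross_records(gaceta_appointments, company_registry, "person_id")
for m in matches:
    print(m["name"], "-", m["position"], "-", m["company"])
```

In this sketch, only people who appear in both datasets are returned, which is the kind of overlap a reporter might want to examine more closely.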

Vendata is in its implementation phase, and IPYS Venezuela is looking for volunteers to help organize the data. Escuela de Datos, a data literacy organization, is helping to spread information about the project to potential collaborators, and the organization's branch in Mexico advised the project in its early stages, Pennacchio said.

For now, 20 people from different professions and regions of Venezuela are extracting information from the Gaceta. Pennacchio added that they hope to formally launch the project in July.

“Anyone wishing to collaborate with the project only needs to have internet at home, a computer and a lot of patience,” Arismendi said. “The Vendata team has also organized data release meetings in Caracas, so while information is being compiled, participants learn to use scraping tools that can be useful in their fields.”

Scraping involves extracting information from documents. Because the Gaceta is saved as a PDF whose contents are messy and unstructured, the team had to scrape the issues manually, Pennacchio said. The documents are also long and hard to read.

That means that the appointment of an ambassador, the approval of a law to increase the minimum wage or interest rates set by the Central Bank of Venezuela might be buried in the extensive issues of the Gaceta, Arismendi explained.
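As a rough illustration of what turning such text into structured records involves, here is a minimal Python sketch. It is not the Vendata team's method, which was manual. The sample text, its layout and the patterns used to read it are invented for this example and are far tidier than a real issue of the Gaceta would be.

```python
import re

# Hypothetical plain text, standing in for text copied out of a PDF.
# The wording and layout are invented for illustration.
sample_text = """
Resolucion No. 123 - 2018-01-15
Se designa a Persona Uno como Director General.
Resolucion No. 124 - 2018-01-16
Se designa a Persona Dos como Embajador.
"""

# Patterns for the two kinds of lines in the hypothetical layout above.
HEADER = re.compile(r"Resolucion No\. (\d+) - (\d{4}-\d{2}-\d{2})")
APPOINTMENT = re.compile(r"Se designa a (.+?) como (.+?)\.")


def extract_appointments(text):
    """Turn free text into a list of structured appointment records."""
    records = []
    current = None  # the resolution header most recently seen
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = HEADER.match(line)
        if header:
            current = {"number": int(header.group(1)), "date": header.group(2)}
            continue
        appointment = APPOINTMENT.match(line)
        if appointment and current:
            records.append({
                **current,
                "name": appointment.group(1),
                "position": appointment.group(2),
            })
    return records


records = extract_appointments(sample_text)
for r in records:
    print(r["date"], r["name"], r["position"])
```

Real documents rarely follow a single pattern this neatly, which is part of why manual extraction can be necessary.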

Additionally, collaborating with the project means going against a non-cooperative trend in Venezuela, Arismendi said.

“Another difficulty we have encountered is that in Venezuela, the concept of ‘collaboration’ is not as ingrained as in other countries,” Arismendi said. “It has not been easy getting people who are eager and committed to participate. However, we are doing everything possible to change that trend. We hope to contribute to the growth of a culture of open data in Venezuela.”

Beyond the difficulty of understanding the documents and building a collaborative environment, the Vendata team also faces technical problems.

“The platform we use to scrape the information is in Beta version, so we are finding errors and are improving it as we use it. We are learning from trial and error. Furthermore, Venezuela has the lowest internet bandwidth in the region; it has affected us greatly and has delayed the work,” Arismendi said.

The Vendata team hopes to continue working towards transparency in Venezuela. After scraping information from the Gaceta, Arismendi said, they plan to move on to other official documents published in non-reusable formats.

This post originally appeared on the Knight Center for Journalism in the Americas' blog and is republished on IJNet with permission.

Main image of Caracas, Venezuela CC-licensed by Flickr via Walter Vargas. Secondary image courtesy Dagne Cobo Buschbeck/Vendata team.