Labs newsletter: 21 November, 2013

NOVEMBER 21, 2013

This week, Labs members gathered in an online hangout to discuss what they’ve been up to and what’s next for Labs. This special edition of the newsletter recaps that hangout for those who weren’t there (or who want a reminder).

Data Pipes update

Last week you heard about Andy Lulham’s improvements to Data Pipes, the online streaming data transformations service. He didn’t stop there, and in this week’s hangout, Andy described some of the new features he has been adding:

  • parse and render are now streaming operations
  • option parsing now uses optimist
  • a basic command-line interface
  • … and much, much more

Coming up next: map & filter with arbitrary functions!

Crowdcrafting: progress and projects

New Shuttleworth fellow Daniel Lombraña González reported on progress with CrowdCrafting, the citizen science platform built with PyBossa.

CrowdCrafting now has more than 3,500 users (though Daniel cautions that this doesn’t mean much in terms of participation), and the site now has more answers than tasks.

Last week, the team at MicroMappers used CrowdCrafting to classify tweets about the typhoon disaster in the Philippines. Digital mapping activists SkyTruth, meanwhile, have used CrowdCrafting to map and track fracking sites in the northeast United States. Daniel has also been in contact with EpiCollect about a project on trash collection in Spain.

Open Data Button

Labs member Oleg Lavrovsky discussed the Open Data Button, an interesting fork of the recently-launched Open Access Button.

The Open Access Button, an idea of the Open Science working group at OKCon 2013, is a bookmarklet that allows users to report their experiences of having their research blocked by paywalls. The Open Data Button applies this same idea to Open Data: users can use it to report their problems with legal and technical restrictions on data. (As Rufus pointed out, this ties in nicely with the IsItOpenData project.)

Queremos Saber

Labs ally VĂ­tor Baptista reported on a new development with Queremos Saber, the Brazilian FOI request portal.

Changes in the way the Brazilian federal government accepts FOI requests have caused Queremos Saber problems. The federal government no longer accepts requests by email, forcing the use of a specialized FOI system which they are now promoting for local governments as well. This limits the number of places that will accept requests from Queremos Saber.

A solution to this problem is underway: an email-based API that will take emails received at certain addresses (e.g. [email protected]) and turn them into instructions for a web crawler to create an FOI request in the appropriate system. An interesting side effect of this would be the creation of an anonymization layer, allowing users to bypass the legal requirement that FOI requests not be placed anonymously.

Philippines Projects

Labs data wrangler Mark Brough showed off a test project collecting data on aid activities in the Philippines. Mark’s small static site, updated each night, collects IATI aid data on projects in the Philippines and republishes it in a more browsable form.

Mark also discussed another data-mashup project, still in the planning stage, that would combine budget and aid data for Tanzania (or any other developing country)—similar to Publish What You Fund’s old Uganda project but based on a non-static dataset.

Global Economic Map

Alex Peek discussed his initiative to create the Global Economic Map, “a collection of standardized data set of economic statistics that can be applied to every country, region and city in the world”.

The GEM will draw data from sources like government publications and SEC filings and will cover eleven statistics that touch on GDP, employment, corporations, and budgets. The GEM aims to be fully integrated with Wikidata.

Frictionless data

Finally, Rufus Pollock discussed and the mission of “frictionless data”: making it “as simple as possible to get the data you want into the tool of your choice.” aims to help achieve this goal by promoting, among other things, simple data standards and the tooling to support them. As reported in last week’s newsletter, this now includes a Data Package Manager based on npm, now working at a very basic level. It also includes the Data Package Viewer, which provides a nice view on data packages hosted on GitHub, S3, or wherever else.

Improving the Labs site

The hangout wrapped up with a discussion of how to improve the Labs site. Besides some discussion of the possibility of a one-click creation system for Open Data Maker Nights, talk focused on improving the projects page.

Oleg, who has volunteered to take the lead in reforming the projects page, highlighted the need for a way to differentiate projects by their activity level and their need for more contributors. Mark agreed, suggesting also that it would be nice to be able to filter projects by the languages and technologies they use. Both ideas were proposed as a way to fill out Tod Robbins’s suggestion that the projects page needs categories.

See the Labs hangout notes for the full details of this discussion.

Get involved

As always, Labs wants you to join in and get involved! Read more about how you can join the community and participate by coding, wrangling data, or doing outreach and engagement, and have a look at the Ideas Page to see what other members have been thinking.