Monthly Archives: September 2015

How to create your own Missing Maps-like PyBossa task

To support the Missing Maps Project, I created a microtasking application built on PyBossa to identify human settlements and roads in South Kivu. To create a detailed map that organisations like MSF or the Red Cross can use in the field, finding these areas quickly is especially important in sparsely populated regions like South Kivu. The results of the application will therefore speed up the mapping process and serve as input for a fine-grained mapping task in the HOT Tasking Manager. Finally, the map will be enriched with information coming directly from the field. Since the PyBossa task is the first step in the overall mapping workflow, finishing it as soon as possible will make a great contribution towards creating the first high-resolution map of South Kivu.

In this blog post I will demonstrate what I did and explain in detail the steps to set up a task in PyBossa. If you have QGIS installed on your computer, you are ready to go.

Main workflow:

  1.  Create a PyBossa project (e.g. at
  2.  Select area of interest and download boundary polygon.
  3.  Split area of interest into smaller rectangular polygons.
  4.  Import your tasks to your PyBossa project.
  5. Design your task presenter.
  6.  Find many contributors.
  7.  Export your results.
  8.  Visualize your results.

And now in detail:

1 – Create a PyBossa project.

  • Visit and sign up or log in with your account.
  • “Create your project” and fill in basic information.

2 – Select area of interest and download boundary polygon.

  • OSM Boundaries 3.5 is a great tool to download administrative boundaries.
  • Export as ESRI shapefiles (.shp).

3 – Split area of interest into smaller rectangular polygons. (I used QGIS for this.)

  1. Open boundary polygon in QGIS.
  2. Calculate vector grid with the following settings.
    • update extent from canvas
    • set width and height (in units of the coordinate reference system, e.g. decimal degrees for EPSG:4326)
    • output grids as polygons
  3. Intersect the vector grid polygons with the boundary polygon using the following settings.
    • input vector layer: grid
    • intersect layer: boundary
  4. Calculate x center and y center for each grid polygon.
    • Open the attribute table of grid polygons layer.
    • Open the field calculator and choose the following settings.
      • create new field
      • output field name: x_center
      • output field type: decimal number (real)
      • precision: 6
      • expression: "X_MAX" - (("X_MAX" - "X_MIN") / 2)
      • repeat these steps for the calculation of y center
  5. Convert multiparts to singleparts.
  6. Export grid polygons as .csv file with the following settings.
    • format: comma separated values [CSV]
    • geometry: as wkt
    • separator: comma
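
For readers who prefer scripting, the grid and center calculation above can be sketched in plain Python. This is only an illustration under assumptions, not the QGIS workflow itself: the function names `make_grid` and `write_tasks_csv` are hypothetical, the bounding box values are made up, and the sketch does not clip the cells to the boundary polygon (the intersect step still needs QGIS or a geometry library).

```python
import csv
import io
import math


def make_grid(x_min, y_min, x_max, y_max, width, height):
    """Split a bounding box into rectangular cells (units follow the CRS)."""
    # Rounding before ceil avoids an extra sliver cell from float noise.
    n_cols = math.ceil(round((x_max - x_min) / width, 9))
    n_rows = math.ceil(round((y_max - y_min) / height, 9))
    cells = []
    for row in range(n_rows):
        for col in range(n_cols):
            cx_min = x_min + col * width
            cy_min = y_min + row * height
            cx_max = min(cx_min + width, x_max)
            cy_max = min(cy_min + height, y_max)
            wkt = (
                f"POLYGON(({cx_min} {cy_min}, {cx_max} {cy_min}, "
                f"{cx_max} {cy_max}, {cx_min} {cy_max}, {cx_min} {cy_min}))"
            )
            cells.append({
                "id": len(cells) + 1,
                "WKT": wkt,
                # Same expression as in the field calculator, precision 6.
                "x_center": round(cx_max - (cx_max - cx_min) / 2, 6),
                "y_center": round(cy_max - (cy_max - cy_min) / 2, 6),
            })
    return cells


def write_tasks_csv(cells, handle):
    """Write the cells as comma-separated values with a WKT geometry column."""
    writer = csv.DictWriter(handle, fieldnames=["id", "WKT", "x_center", "y_center"])
    writer.writeheader()
    writer.writerows(cells)


# Made-up extent in decimal degrees, split into 0.5 x 0.5 degree cells.
cells = make_grid(28.0, -3.0, 29.0, -2.0, 0.5, 0.5)
buffer = io.StringIO()
write_tasks_csv(cells, buffer)
print(buffer.getvalue())
```

To write a real file, pass a handle opened with `open("tasks.csv", "w", newline="")` instead of the in-memory buffer.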

4 – Import tasks to your PyBossa project.

  • You can upload your tasks using the Google Drive importer, the CSV URL importer or the Dropbox importer.
  • Just upload the resulting .csv file from your QGIS workflow.
  • Note: I recommend the CSV URL importer. Be aware that uploading a lot of data may take some time.

5 – Design your task presenter.

  • Edit the basic task presenter. I created a template you can use; it is available on GitHub.
  • Replace ‘missing_maps_follow_up’ with the shortname of your project at line 140 and line 216.
  • You may adjust the (default, minimum and maximum) zoom level to the scale of your task at lines 123, 124 and 125.
  • You may adjust the size of your map at line 30.
  • Finally, your task should look similar to this.

6 – Find many contributors.

7 – Export your results.

  • You can export the input (tasks) and results (task_runs) of your application in .csv format.
  • The table tasks contains information on app_id, calibration, created, id, info, n_answers, priority_0, quorum, state, taskinfo__a, taskinfo__b, taskinfo__c, …
  • The column “info” in the table tasks contains the information you uploaded to your PyBossa project (e.g. WKT, x_center, y_center, id).
  • The table task_runs contains information on app_id, calibration, created, finish_time, id, info, task_id, timeout, user_id and user_ip.
  • The column “info” in the table task_runs contains the actual result of each microtask.
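
As a quick check of an export, the results can be tallied with a few lines of Python. This is a minimal sketch under assumptions: the column name `task_run__info` follows the field names used in the QGIS join in step 8, the function name `count_answers` is hypothetical, and the sample answers ("yes", "no") are made up; your export may contain different values or a JSON string in that column.

```python
import csv
import io
from collections import Counter


def count_answers(handle, info_column="task_run__info"):
    """Count how often each result value occurs in a task_run export."""
    reader = csv.DictReader(handle)
    return Counter(row[info_column] for row in reader)


# Made-up sample standing in for an exported task_run .csv file.
sample = io.StringIO(
    "task_run__id,task_run__task_id,task_run__info\n"
    "1,10,yes\n"
    "2,10,yes\n"
    "3,11,no\n"
)
for answer, count in count_answers(sample).most_common():
    print(answer, count)
```

For a real export, open the file with `open("kivu_task_run.csv", newline="")` and pass the handle to `count_answers`.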

8 – Visualize your results.

  • To visualize your results on a map you need to merge the input (e.g. kivu_task.csv) and the results of the classification (e.g. kivu_task_run.csv) in QGIS.
  1.  Open task.csv file. (“Add delimited text layer”)
    • file format: CSV (comma separated values)
    • record options: First record has field names
    • geometry definition: well known text (WKT)
  2.  Open task_run.csv file. (“Add delimited text layer”)
    • file format: CSV (comma separated values)
    • record options: First record has field names
    • geometry definition: No geometry (attribute only table)
  3.  Join attributes.
    • Open properties of the task polygons layer.
    • Add vector join with the following settings.
      • join layer: task_run
      • join field: task_run__task_id
      • target field: task__id
      • choose which fields are joined: task_run__info
  4.  Save the join in a new shapefile (e.g. kivu_results.shp)
  5.  Additionally you may dissolve polygons with the same classification result into one geometry.
    • Open “Singlepart to Multipart”.
    • Unique id field: field with the task_run results
  6.  Style your map or upload your results (e.g. present the current state of your task on a uMap).
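
The join can also be sketched outside QGIS. The following is an illustration under assumptions, not the QGIS procedure: it uses the field names `task__id`, `task_run__task_id` and `task_run__info` from the join settings above, assumes the geometry column of the task file is called `WKT` (adjust `wkt_column` if your export names it differently), and resolves several answers per task by majority vote, which is my own choice for handling redundant task runs. The function name `join_results` and the sample data are hypothetical.

```python
import csv
import io
from collections import Counter, defaultdict


def join_results(tasks_handle, runs_handle, wkt_column="WKT"):
    """Attach the most frequent answer to each task polygon."""
    votes = defaultdict(Counter)
    for run in csv.DictReader(runs_handle):
        votes[run["task_run__task_id"]][run["task_run__info"]] += 1

    joined = []
    for task in csv.DictReader(tasks_handle):
        task_votes = votes.get(task["task__id"])
        if not task_votes:
            continue  # no contributions for this task yet
        answer, _count = task_votes.most_common(1)[0]
        joined.append({
            "task__id": task["task__id"],
            "WKT": task[wkt_column],
            "result": answer,
        })
    return joined


# Made-up samples standing in for the exported task and task_run files.
tasks = io.StringIO(
    'task__id,WKT\n'
    '10,"POLYGON((0 0, 1 0, 1 1, 0 1, 0 0))"\n'
    '11,"POLYGON((1 0, 2 0, 2 1, 1 1, 1 0))"\n'
    '12,"POLYGON((2 0, 3 0, 3 1, 2 1, 2 0))"\n'
)
runs = io.StringIO(
    "task_run__task_id,task_run__info\n"
    "10,yes\n"
    "10,yes\n"
    "10,no\n"
    "11,no\n"
)
for row in join_results(tasks, runs):
    print(row["task__id"], row["result"])
```

The returned rows can be written back to a .csv file with a WKT column and loaded into QGIS with "Add delimited text layer" as described above.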


Have fun!

Additional notes:

  • If PyBossa runs on your own server, you may import the tasks directly into PostgreSQL. This is much faster than uploading your tasks through the interface. To do so, you need to create a .csv file with the following columns:
    • id: unique number
    • created: insert “0”
    • app_id: id from your application
    • quorum: insert “0”
    • calibration: insert “0”
    • priority_0: insert “0”
    • info: the most important field; it holds the task information
      • format: {"key_1": "value_1", "key_2": "value_2", "key_3": "value_3"}
    • n_answers: redundancy of the tasks
  • GitHub Repository:
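
The .csv file for the direct PostgreSQL import described in the notes above could be generated along these lines. This is a hedged sketch: the column list and the "0" placeholders come from the notes, while the function name `build_import_rows`, the example `app_id`, the redundancy value and the sample task information are made up. It only prepares the file; how you load it into the database depends on your server setup.

```python
import csv
import io
import json

COLUMNS = ["id", "created", "app_id", "quorum", "calibration",
           "priority_0", "info", "n_answers"]


def build_import_rows(task_infos, app_id, n_answers, start_id=1):
    """Build one row per task, with the task information encoded as JSON."""
    rows = []
    for offset, info in enumerate(task_infos):
        rows.append({
            "id": start_id + offset,      # unique number
            "created": 0,
            "app_id": app_id,             # id of your application
            "quorum": 0,
            "calibration": 0,
            "priority_0": 0,
            "info": json.dumps(info),     # {"key_1": "value_1", ...}
            "n_answers": n_answers,       # redundancy of the tasks
        })
    return rows


# Made-up task information; in practice this comes from the QGIS export.
infos = [
    {"WKT": "POLYGON((28.0 -3.0, 28.5 -3.0, 28.5 -2.5, 28.0 -2.5, 28.0 -3.0))",
     "x_center": 28.25, "y_center": -2.75},
]
rows = build_import_rows(infos, app_id=42, n_answers=3)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

The `csv` module quotes the JSON string automatically, so the commas inside the info field do not break the column layout.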