Parltrack needs money to keep on turning PDFs and DOCs into usable data

Stefan writes, "Parltrack is free software that liberates a lot of hard-to-process data (like PDFs, Word docs, and HTML pages) as reusable open data and presents this as a kind of dashboard for activists, providing fresh and relevant data not only for the concerned but for the curious citizen as well. Even pros from the European Parliament have praised it. Parltrack is free software; for further development it needs a few more backers in its crowdsourcing campaign."

6 Responses to “Parltrack needs money to keep on turning PDFs and DOCs into usable data”

  1. Vinnie Tesla says:

    If HTML isn’t an open format, what on earth is??

    • luxliquidus says:

      Read it more carefully… “hard to process data” to “reusable open data”.  Not all open data is easy to process or easily reusable.

    • noah says:

      I think this is a scraper of some sort; it pulls numbers (“data”) out of the various formatted documents.

      • parltrack says:

        hey, I’m the author. Besides being a scraper, Parltrack is also curated, aggregated and presented in a way most useful for active citizens. The liberated data is not so much numerical as procedural: deadlines for tabling amendments, and the amendments themselves, are prominent examples of Parltrack-liberated data.

  2. nosehat says:

    This would appear to be the github for the code, if you would rather contribute that way.

    • parltrack says:

      Correct. The code is a bit of an ode to the FSM in some places and would need a lot of love and refactoring. But that is mostly due to the whole thing trying to be very tolerant towards data fed into it; lots of edge cases and other exceptions are handled. If you want to develop something quickly, the daily dumped db is an easy entry point.
