Thursday, June 23, 2011

Pet project: Metadata Flask

Just a quick announcement of a pet project I started a few months ago. Unfortunately, work on it has already slowed to a halt as I focus on finishing my MBA dissertation.

The objective of the Metadata Flask project is to provide tools that support finding, collaborating on and working with metadata on an open, linked-data web.

Nowadays, when players in an industry vertical (say, commodities producers and large-scale retailers like Walmart, or Oil & Gas equipment producers and oil companies/operators) want to define and evolve data exchange formats, they have to build their own infrastructure for documenting, referencing and hosting data exchange definitions.

These players frequently need to exchange a lot of data describing what is produced, its specs, how it is sold, who wants to buy it, etc. So they come up with metadata definitions like WSDL contracts, XML schemas, CSV/JSON templates and so on.

I believe their work would be a lot easier if infrastructure for hosting, exchanging, discovering, collaboratively editing and moderating metadata, and for many other metadata-related processes, were already in place.

The goal is ambitious and the problem is not yet well defined or "tractable", so I started by investigating what open data sets are available out there. What I realized is that many of them can't be found using general-purpose search engines, so I started a first subproject, the Open Data Directory, which is a search engine for open data sets published by governments, private companies and other organizations. It already indexes 360K+ datasets from many sources, and traffic is slowly increasing as search engine bots index its contents.

I wish Amazon would innovate faster on its Kindle OS


Amazon seems to be really slow and conservative about features for the embedded software (OS) running on its Kindle devices. The last update (v3.1) was in Feb 2011. I don't see why they can't let users opt in to cutting-edge feature experiments. This way they could easily roll out wireless beta updates to advanced users and experiment with feature designs/variations, collecting usage data on what works and what doesn't.

Some areas they could've been experimenting with publicly for ages:
  1. Social features: why stop at public note sharing via Twitter/Facebook? Why not integrate with services like Evernote, Instapaper, Delicious? Why can't it sync my notes to Amazon's website? It would be nice to be able to review them online.
  2. Notes/highlights usability: The reading experience should be at least as powerful as the Real Thing. Why can't I attach notes to whole chapters or even books? Why restrict it to contiguous blocks of text? After all, the whole reading experience is mostly not about what I read, but what I thought about it, what I learned from it, how it applies to me, and people use notes for remembering these things.
  3. Podcasts: it has enough storage and CPU for playing MP3s and downloading data in the background. Why not schedule downloads and manage podcasts?
  4. Improve the experience for reading blogs. Why not integrate with Google Reader?