Desktop Search Hackfest

See also the discussion page.

Event description

[COMPLETE]

Hackfest around the Desktop Search and Metadata handling topic. Currently there are many partial solutions to the problem, so a hackfest could help us share some code, define directions for the projects and steal the good points of other projects :P

Proposed dates

Participants

  • Name (Project / Location)
  • Ben van Klinken (CLucene/NL) mailinglist
  • Jos van den Oever (Strigi/NL) mailinglist
  • Jamie McCracken (Tracker/Turks&Caicos,Caribbean) mailinglist
  • Sebastian Trueg (Nepomuk/Germany) self covered trip + own accommodation mailinglist
  • Lukas Lipka (Beagle-Dashboard/Slovakia) mailinglist
  • Kevin Kubasik (Beagle/USA) mailinglist
  • Kevin Ottens (KDE Plasma search / France/Toulouse) mailinglist
  • Jerry Tan (Tracker) mailinglist
  • Martyn Russell (Tracker/UK) covered trip + stay mailinglist
  • Carlos Garnacho (Tracker/Spain) covered trip + stay mailinglist
  • Philip van Hoof (Tracker/Belgium) covered trip + stay mailinglist
  • Urho Konttori (Tracker/Finland) covered trip + stay mailinglist
  • Ivan Frade (Tracker/Finland) covered trip + stay mailinglist
  • Mikael Ottela (Tracker/Finland) covered trip + stay mailinglist
  • Mikkel Kamstrup (Xesam/Denmark) mailinglist
  • Evgeny Egorochkin (Xesam/Ukraine) mailinglist
  • Sebastian Pölsterl (Deskbar-Applet/Germany) mailinglist
  • Ben Martin (libferris/Australia) mailinglist

fabrice.colin was invited, is on mailinglist, but won't make it.

Expected Arrivals

VERY IMPORTANT - NEEDED FOR FINDING ACCOMMODATION

Thursday 18

Preferred day for arrivals: this way you have the whole Friday for warming up and attending the Maemo Summit.

  • Ben van Klinken (Haarlem/NL: any time)
  • Urho Konttori (Helsinki 20:00+)
  • Ivan Frade (Helsinki 18:00+)
  • Philip van Hoof (Brussels/Eindhoven 18:00+)
  • Martyn Russell (London 18:00+)
  • Carlos Garnacho (Madrid 18:00+)
  • Mikael Ottela (Helsinki 18:00+)
  • Jamie McCracken (via New York 15:00+)
  • Sebastian Pölsterl (from Munich, no preferred time)
  • Lukas Lipka (Bratislava 18:00+)
  • Jos van den Oever (from Hengelo, arrive at 17.08)
  • Ben Martin
  • Kevin Ottens
  • Evgeny Egorochkin (Xesam/Ukraine)

Friday 19

  • Kevin Kubasik (from SLC Int'l)

Saturday 20

Expected Departures

VERY IMPORTANT - NEEDED FOR FINDING ACCOMMODATION

Sunday 21

Monday 22

  • Urho Konttori
  • Ivan Frade
  • Philip van Hoof
  • Martyn Russell
  • Carlos Garnacho
  • Mikael Ottela
  • Sebastian Pölsterl (Returning to Munich between 18:00 and 22:00)
  • Kevin Kubasik (Returning to Washington Dulles)
  • Lukas Lipka
  • Ben van Klinken
  • Kevin Ottens

Tuesday 23

Monday night is the last one covered by the Hackfest sponsorship. Feel free to stay until Tuesday to make the most of your trip.

  • Jamie McCracken
  • Jos van den Oever
  • Ben Martin
  • Evgeny Egorochkin (Xesam/Ukraine)

Proposed items to work

Concrete Coding Tasks

  • Xesam integration in file chooser and Nautilus. Possibly use xesam-glib
  • Create language bindings for xesam-glib (specifically Vala, C#, and Python) and use these to build
    • a deskbar module
    • a Gnome Do add-in
    • Gnome launch box extension
  • Create/draft a xesam-gtk library with widgets empowered by xesam-glib
  • Create a small server that exposes the Xesam search engine over Avahi (probably over http). This is correlated with the second point under BOFs.
  • Create a common metadata test corpus
  • Create a common metadata unit test suite for the most common file types
    • Improve current extractors (all engines). Before the hack meeting, prepare a huge database of Creative Commons content with a lot of metadata, including corrupted files. Prepare an automated script to check whether we are extracting the interesting information from the files (showing information like "extracted the expected information in 90% of the mp3, incomplete information in 9%, crashed in 1% of them"). We can even organize a small competition with a modest prize.
  • Time ordering / searching optimized model for tracker
  • Category optimized database model for tracker
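
The automated extractor check proposed in the list above could be sketched roughly as follows. This is only an illustration under stated assumptions: the corpus is assumed to be a mapping from file path to the metadata we expect to find, and `extract` is a hypothetical callable standing in for whatever extractor an engine provides (none of these names come from Tracker, Beagle or Strigi). A Python exception stands in for a crash here; a real harness would probably run each extractor in a separate process so that a segfault can be detected.

```python
from collections import Counter, defaultdict

# Hypothetical outcome labels for one extractor run on one file.
COMPLETE, INCOMPLETE, CRASHED = "complete", "incomplete", "crashed"


def classify(expected, extract, path):
    """Run the extractor on one file and compare with the expected metadata."""
    try:
        found = extract(path)
    except Exception:
        # Stand-in for a crash; assumes the extractor fails by raising.
        return CRASHED
    # Complete only if every expected key came back with the expected value.
    if all(found.get(key) == value for key, value in expected.items()):
        return COMPLETE
    return INCOMPLETE


def summarize(corpus, extract):
    """Return {file type: {outcome: percentage}} for a path -> expected-metadata corpus."""
    counts = defaultdict(Counter)
    for path, expected in corpus.items():
        file_type = path.rsplit(".", 1)[-1].lower()
        counts[file_type][classify(expected, extract, path)] += 1
    report = {}
    for file_type, outcomes in counts.items():
        total = sum(outcomes.values())
        report[file_type] = {
            label: 100.0 * outcomes[label] / total
            for label in (COMPLETE, INCOMPLETE, CRASHED)
        }
    return report


def fake_extract(path):
    """Toy extractor used only to exercise the report (not a real engine)."""
    if "broken" in path:
        raise ValueError("corrupted file")
    if "partial" in path:
        return {"title": "Song"}
    return {"title": "Song", "artist": "Band"}


expected = {"title": "Song", "artist": "Band"}
corpus = {name: expected for name in ("a.mp3", "b.mp3", "partial.mp3", "broken.mp3")}
report = summarize(corpus, fake_extract)

for file_type, outcomes in report.items():
    print(file_type, outcomes)
```

Each engine could plug its own extractor into `summarize` and compare the resulting percentages per file type, which is also what a small competition would need.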

Beagle Coding Task Ideas

  • Beagle-Xesam work: we will have almost everyone involved in this spec available to us, so let's work on getting our compatibility to 100%
    • Gnome-Do plugin. I want a solid beagle plugin for Gnome-Do, should be a super-quick codeup (Preferably over xesam!)
  • Dashboard. I have a million and one ideas for this, but the core of it would hopefully be getting some unified means of generating clue packets/context. I am thinking that utilizing screenreader APIs to get working windows/text could be very helpful here.
    • Work towards functional/usable/stable UI. Maybe even alpha demo?
  • Beagle Client Optimization. We still do the whole XML serializing and deserializing thing... Look into D-Bus/Protocol Buffers to increase performance
    • This could increase the beagle-xesam performance considerably, and has the potential to fix Dashboard bottlenecks as well.

BOF Sessions

  • There are several metadata-heavy technologies emerging. Soylent, People, Online Desktop/Desktop Data Model, Xesam, and others. Can we somehow work more together? They all appear to take slightly different approaches.
  • How to share metadata between engines when it's not stored in the file itself (say... tags)
  • How to share extractors between the engines. (How much code to read id3 is out there?)
  • A shared way to harvest metadata and register metadata extractors or sources. This is also relevant for Xesam.
  • Dashboard? Why has the idea that everybody loved never landed on consumer desktops? How can we make it real? What technical solutions do we need in place?
  • While it is pretty hype to talk about desktop search and even write lots of code for it, why is it not more integrated in the desktop than it is? A big reason is of course the quality of the search engine. I can think of a lot of other reasons though (feel the teaser!).
  • Xesam over alternative protocols. Keywords: http/REST, Avahi, Bluetooth, XMLRPC, Soap, Plain ol' socket.
  • How can we integrate pervasive searching capabilities in the current Gnome desktop (i.e. without changing the desktop interaction model)?
  • How can we create a whole new user interface based on metadata and instant searches, i.e. possibly breaking totally with the standard interaction model of the desktop? One possible starting point:
    • "do-what-I-think-desktop" The basic premise is "the user should not need to even touch the computer. It should just do the expected/desired in all circumstances without user interaction". Then see how far we can go with statistical analysis of historic user actions and rich metadata - and then accept that we can not achieve the end goal, but still get as close as possible.


  • Discuss the Xesam Metadata Storage spec. It is slated to be included in the post 1.0 release of Xesam, but there is very little concrete written down or agreed upon. This can seriously use a lot of discussion. It has ramifications into Soylent and desktop-data-model as well, probably others too.
  • Gnome and Nepomuk? Hitherto Gnome and Nepomuk have not really been related at all. Even though Xesam and Nepomuk have their disagreements, we are also trying to collaborate. Should Gnome do more? What steps would be necessary to utilize Nepomuk technologies in Gnome?
  • Semantic Gnome?
  • Smarter Searching in Gnome: Keyword matching is cool, but user data is becoming more and more massive, Terabyte desktops are not unheard of. Thousands of e-mails and documents need a better ranking system.

Meta

  • It would be great to have RC3 of the Xesam Search spec ready at least a week or two ahead of this. It is likely to contain some (minor) API-breaks. Probably an updated xesam-glib to go with it too.
  • Given an updated Xesam spec it would be great to have all servers updated to the latest spec and have easy-to-set-up trunks or branches. The point is that a hackfest should not be spent with everybody trying to set up a privately circulated branch of MyGreatSearchEngine.

Organization ToDo

Steps we need to make in order to bring this forward.

  • Coordination team
    • A list of people easy to CC and decide fast on pure organizational stuff e.g. money, people, places. Proposal: Urho Konttori (link with Maemo Summit organization), Vincent Untz (GNOME Foundation), Xesam rep, Tracker rep & Beagle rep. Ideally a local rep too but let's not wait for this to start moving forward.--qgil 08:13, 28 July 2008 (UTC)
  • Define dates
    • Proposal: September 19 & 20 WarmUp, people can attend the Maemo Summit and start the discussions e.g. at a project level. 21 & 22 Official Desktop Search Hackfest. Some might want to have an After Hours (?).--qgil 08:13, 28 July 2008 (UTC)
  • Define people
    • I need the list of people to be sponsored asap to arrange travel and know for sure what is needed for accommodation.--qgil 08:13, 28 July 2008 (UTC)
  • Infrastructure required
    • One workspace room with wlan. Anything else? For instance, a projector or not? --qgil 08:13, 28 July 2008 (UTC)
      • Probably more than one workspace. Some people can work in xesam while other people hack/discuss other topics. --ifrade 13:49, 30 July 2008 (UTC)
  • Program
    • Flexible of course, but it is good to have a mission, achievable and specific objectives, and a reference schedule confirmed in advance.
      • I'll try to organize a list of goals and tasks. We can use that to check the success of the hackfest. --ifrade 13:49, 30 July 2008 (UTC)
  • Marketing & press
    • The World needs to know, before and after! How?
      • At least blogging of the people there. Publicity in planet.gnome/planet.kde --ifrade 13:49, 30 July 2008 (UTC)

Anything else? Add it to the list.