Desktop Search Hackfest

Image:DesktopSearchHackfest.jpg

A hackfest on desktop search and metadata handling. There are currently many partial solutions to the problem, so a hackfest could help the projects share code, agree on directions, and steal each other's good ideas :P

Proposed dates

  • After Maemo Summit 2008 (September 19-20th + weekend)
    • I would like to propose holding the actual hackfest on 20-21 September, even though this gives a one-day overlap with the Maemo Summit. This would also leave Monday as a scheduled wrap-up day.
  • In Berlin

Notes

Participants

  • Name (Project / Location)
  • Ben van Klinken (CLucene/NL) mailinglist
  • Jos van den Oever (Strigi/NL) mailinglist
  • Jamie McCracken (Tracker/Turks&Caicos,Caribbean) mailinglist
  • Sebastian Trueg (Nepomuk/Germany) self covered trip + own accommodation mailinglist
  • Lukas Lipka (Beagle-Dashboard/Slovakia) mailinglist
  • Kevin Kubasik (Beagle/USA) mailinglist
  • Kevin Ottens (KDE Plasma search / France/Toulouse) mailinglist
  • Martyn Russell (Tracker/UK) covered trip + stay mailinglist
  • Carlos Garnacho (Tracker/Spain) covered trip + stay mailinglist
  • Philip van Hoof (Tracker/Belgium) covered trip + stay mailinglist
  • Urho Konttori (Tracker/Finland) covered trip + stay mailinglist
  • Ivan Frade (Tracker/Finland) covered trip + stay mailinglist
  • Mikael Ottela (Tracker/Finland) covered trip + stay mailinglist
  • Mikkel Kamstrup (Xesam/Denmark) mailinglist
  • Evgeny Egorochkin (Xesam/Ukraine) mailinglist
  • Sebastian Pölsterl (Deskbar-Applet/Germany) mailinglist
  • Ben Martin (libferris/Australia) mailinglist
  • Shivakumar Manishankar (Tracker tester)
  • Anders Rune Jensen (Nemo/Denmark) mailinglist?
  • Shivakumar Manishankar (Xesam/Tracker - Testing)
  • Rob Taylor (gnome)
  • Jürg Billeter (gnome)
  • Marius Vollmer (Nokia)
  • Otto Kopra (Nokia)

Fabrice Colin (Pinot) was invited and is on the mailing list, but won't make it.

Michael Albinus (Xesam, Emacs, Debian) would like to attend the Ontology Workshop. He is on the Xesam mailing list on FDO. He should be notified ASAP when we know when the ontology workshop will be held. He lives in Berlin.

Expected Arrivals

VERY IMPORTANT - NEEDED FOR FINDING ACCOMMODATION

Thursday 18

Preferred day for arrivals: this way you have the whole of Friday for warming up and attending the Maemo Summit.

  • Ben van Klinken (Haarlem/NL: any time - booking himself)
  • Urho Konttori (Helsinki 20:00+ - Urho group booking)
  • Ivan Frade (Helsinki 18:00+ - Urho group booking)
  • Philip van Hoof (Brussels/Eindhoven 18:00+ - Urho group booking)
  • Martyn Russell (London 18:00+ - Urho group booking)
  • Carlos Garnacho (Madrid 18:00+ - Urho group booking)
  • Mikael Ottela (Helsinki 18:00+ - Urho group booking)
  • Jamie McCracken (via New York 15:00+ - Urho group booking)
  • Sebastian Pölsterl (from Munich, no preferred time - booking himself)
  • Lukas Lipka (Bratislava 18:00+ - Quim group, booking requested to travel agent)
  • Jos van den Oever (from Hengelo, arrive at 17.08 - Quim group, booking requested to travel agent)
  • Kevin Ottens - Quim group, booking requested to travel agent
  • Evgeny Egorochkin (Xesam/Ukraine - Quim group, booking requested to travel agent)
  • Ben Martin (libferris/Australia - Quim group, booking requested to travel agent) (date to be confirmed)
  • Anders Rune Jensen (unknown time and location)
  • Shivakumar Manishankar (Helsinki 18:00+ - Urho group booking)
  • Rob Taylor (Manchester 18:00+ - Urho group booking)
  • Jürg Billeter (Zürich 18:00+ - Urho group booking)
  • Marius Vollmer (Helsinki 17:00+ - Own booking - own hotel)
  • Otto Kopra (Helsinki 17:00+ - Own booking - own hotel)

Friday 19

  • Kevin Kubasik (from SLC Int'l - Quim group, booking requested to travel agent). Can leave at 9pm on Thursday
  • Mikkel Kamstrup (Xesam/Denmark) (Copenhagen, 15:15)
  • Sebastian Trueg (Mandriva/Germany)

Saturday 20

Expected Departures

VERY IMPORTANT - NEEDED FOR FINDING ACCOMMODATION

Sunday 21

Monday 22

  • Urho Konttori
  • Ivan Frade
  • Philip van Hoof
  • Martyn Russell
  • Carlos Garnacho
  • Mikael Ottela
  • Sebastian Pölsterl (Returning to Munich between 18:00 and 22:00)
  • Kevin Kubasik (Returning to Washington Dulles)
  • Lukas Lipka
  • Ben van Klinken
  • Kevin Ottens
  • Mikkel Kamstrup (Xesam/Denmark) (Returning to Copenhagen, departure 20:55)
  • Anders Rune Jensen
  • Rob Taylor
  • Jürg Billeter
  • Marius Vollmer
  • Otto Kopra

Tuesday 23

Monday night is the last one covered by the Hackfest sponsorship. Feel free to stay until Tuesday to make the most of your trip.

  • Jamie McCracken
  • Jos van den Oever
  • Ben Martin
  • Evgeny Egorochkin (Xesam/Ukraine)
  • Ben Martin (libferris/Australia) (date to be confirmed)

Proposed items to work on

The sessions are being ordered and organized on the Schedule page.

Concrete Coding Tasks

  • Xesam integration in the file chooser and Nautilus. Possibly use xesam-glib
  • Write a D-Bus specification for thumbnailers
  • Create language bindings for xesam-glib (specifically Vala, C#, and Python) and use these to build:
    • a deskbar module
    • a Gnome Do add-in
    • a Gnome Launch Box extension
      • Sebastian has mentioned that he would like to work on Python and C# bindings for xesam-glib as well as a Deskbar module
  • Create/draft a xesam-gtk library with widgets empowered by xesam-glib
  • Create a small server that exposes the Xesam search engine over Avahi (probably over http). This is correlated with the "Xesam over alternative protocols BOF".
  • Create a common metadata test corpus
  • Create a common metadata unit test suite for the most common file types
    • Improve current extractors (all engines). Before the hack meeting, prepare a large database of Creative Commons content with a lot of metadata, including corrupted files. Prepare an automated script to check whether we are extracting the interesting information from the files (showing information like "extracted the expected information in 90% of the mp3s, incomplete information in 9%, crashed in 1% of them"). We can even organize a small competition with a modest prize.
  • Time ordering / searching optimized model for tracker
  • Category optimized database model for tracker
  • Document Xesam ontology
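
The automated extractor check proposed above could start from something like the following. This is only a rough sketch under stated assumptions, not an agreed design: the function names, the use of plain dictionaries for metadata, and the convention that None stands for a crashed extractor are all hypothetical. A real harness would have to run each engine's extractor and turn its output into this form.

```python
from collections import Counter


def classify(expected, extracted):
    """Classify one file as 'complete', 'incomplete' or 'crashed'.

    expected  -- dict of the metadata fields the test corpus says the file has
    extracted -- dict of the fields the extractor returned, or None if the
                 extractor crashed (hypothetical convention for this sketch)
    """
    if extracted is None:
        return "crashed"
    # Extra fields are fine; every expected field must be present and equal.
    if all(extracted.get(key) == value for key, value in expected.items()):
        return "complete"
    return "incomplete"


def summarize(results):
    """Return the percentage of files per outcome.

    results -- iterable of (expected, extracted) pairs, one per corpus file
    """
    counts = Counter(classify(exp, got) for exp, got in results)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        outcome: 100.0 * counts[outcome] / total
        for outcome in ("complete", "incomplete", "crashed")
    }
```

Running summarize over one (expected, extracted) pair per file of a given type would give the per-type percentages that the task describes, and the same numbers could be compared across engines for the competition.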

Beagle Coding Task Ideas

  • Beagle-Xesam work. We will have almost everyone involved in this spec available to us, so let's work on getting our compatibility to 100%
    • Gnome-Do plugin. I want a solid Beagle plugin for Gnome-Do; it should be a super-quick codeup (preferably over Xesam!)
  • Dashboard. I have a million and one ideas for this, but the core of it would hopefully be getting some unified means of generating clue packets/context. I am thinking that utilizing screen reader APIs to get working windows/text could be very helpful here.
    • Work towards functional/usable/stable UI. Maybe even alpha demo?
  • Beagle Client Optimization. We still do the whole XML serializing and deserializing thing... Look into D-Bus/Protocol Buffers to increase performance
    • This could increase the beagle-xesam performance considerably, and has the potential to fix Dashboard bottlenecks as well.

BOF Sessions

  • There are several metadata-heavy technologies emerging. Soylent, People, Online Desktop/Desktop Data Model, Xesam, and others. Can we somehow work more together? They all appear to take slightly different approaches.
  • How to share metadata between engines when it's not stored in the file itself (say... tags)
  • How to share extractors between the engines. (How much code to read ID3 is out there?)
  • A shared way to harvest metadata and register metadata extractors or sources. This is also relevant for Xesam.
  • Dashboard? Why has the idea that everybody loved never landed on consumer desktops? How can we make it real? What technical solutions do we need in place?
  • While it is pretty hype to talk about desktop search and even write lots of code for it, why is it not more integrated in the desktop than it is? A big reason is of course the quality of the search engine. I can think of a lot of other reasons though (feel the teaser!).
  • Xesam over alternative protocols. Keywords: http/REST, Avahi, Bluetooth, XMLRPC, Soap, Plain ol' socket.
  • How can we integrate pervasive searching capabilities in the current Gnome desktop (i.e. without changing the desktop interaction model)?
  • How can we create a whole new user interface based on metadata and instant searches, i.e. possibly breaking totally with the standard interaction model of the desktop? One possible starting point:
    • "do-what-I-think-desktop" The basic premise is "the user should not need to even touch the computer. It should just do the expected/desired in all circumstances without user interaction". Then see how far we can go with statistical analysis of historic user actions and rich metadata - and then accept that we can not achieve the end goal, but still get as close as possible.
  • Discuss the Xesam Metadata Storage spec. It is slated to be included in the post 1.0 release of Xesam, but there is very little concrete written down or agreed upon. This can seriously use a lot of discussion. It has ramifications into Soylent and desktop-data-model as well, probably others too.
  • Gnome and Nepomuk? Hitherto Gnome and Nepomuk have not really been related at all. Even though Xesam and Nepomuk have their disagreements, we are also trying to collaborate. Should Gnome do more? What steps would be necessary to utilize Nepomuk technologies in Gnome?
  • Semantic Gnome?
  • Smarter Searching in Gnome: Keyword matching is cool, but user data is becoming more and more massive; terabyte desktops are not unheard of. Thousands of e-mails and documents need a better ranking system.
  • Xesam Ontology review and discussion. This should be very, very high priority if you ask me. Preferably led by Evgeny. A very good follow-up to this would be to make it continue into an ontology-documentation hackparty --kamstrup 09:53, 3 August 2008 (UTC).
  • Xesam Roadmap to 1.0. Set deadline for 1.0. Outline items needing to be addressed. Who does what. Etc.
  • Xesam post 1.0, a BOF about Xesam's future. Xesam 1.1 features, e.g. index metadata (term count), metadata storage API, see [[1]]. Xesam 2.0?
  • PagedSearch for Xesam (Philip is interested in leading this BOF)

Meta

  • It would be great to have RC3 of the Xesam Search spec ready at least a week or two ahead of this. It is likely to contain some (minor) API-breaks. Probably an updated xesam-glib to go with it too.
  • Given an updated Xesam spec, it would be great to have all servers updated to the latest spec and have easy-to-set-up trunks or branches. The point is that a hackfest should not be spent with everybody trying to set up a privately circulated branch of MyGreatSearchEngine.

Organization ToDo

Steps we need to make in order to bring this forward.

  • Coordination team
    • A list of people easy to CC and decide fast on pure organizational stuff e.g. money, people, places. Proposal: Urho Konttori (link with Maemo Summit organization), Vincent Untz (GNOME Foundation), Xesam rep, Tracker rep & Beagle rep. Ideally a local rep too but let's not wait for this to start moving forward.--qgil 08:13, 28 July 2008 (UTC)
      • I think you can add me as the Xesam rep. Unless someone else with a fairly neutral Xesam background wants to step up? --kamstrup 22:31, 2 August 2008 (UTC)
  • Define dates
    • Proposal: September 19 & 20 WarmUp, people can attend the Maemo Summit and start the discussions e.g. at a project level. 21 & 22 Official Desktop Search Hackfest. Some might want to have an After Hours (?).--qgil 08:13, 28 July 2008 (UTC)
    • It would be more convenient for me if it was 20-21 Sept. This of course overlaps with the Maemo Summit. Then Monday could be used as a more "official wrap up day" where people could collect the results and hack on those things that we agreed on during the "official" days. Also, keeping a single WarmUp day should be fine, I believe --kamstrup 22:31, 2 August 2008 (UTC)
  • Define people
    • I need the list of people to be sponsored asap to arrange travel and know for sure what is needed for accommodation.--qgil 08:13, 28 July 2008 (UTC)
  • Infrastructure required
    • One workspace room with wlan. Anything else? For instance, a projector or not? --qgil 08:13, 28 July 2008 (UTC)
      • Probably more than one workspace. Some people can work in xesam while other people hack/discuss other topics. --ifrade 13:49, 30 July 2008 (UTC)
  • Program
    • Flexible of course, but it would be good to have confirmed in advance a mission, specific and achievable objectives, and a reference schedule.
      • I'll try to organize a list of goals and tasks. We can use that to check the success of the hackfest. --ifrade 13:49, 30 July 2008 (UTC)
      • We should make a list of people who are willing to be BOF-responsible, and what BOFs they would like to be responsible for --kamstrup 22:36, 2 August 2008 (UTC)
  • Marketing & press
    • The World needs to know, before and after! How?
      • At least blogging by the people there. Publicity in planet.gnome/planet.kde --ifrade 13:49, 30 July 2008 (UTC)

Anything else? Add it to the list.