Editing Desktop Search Hackfest/Day one notes

Latest revision Your text
Line 24: Line 24:
* tar, bz2, zip, rpm, deb pdf, ole2, mail, .a
* tar, bz2, zip, rpm, deb pdf, ole2, mail, .a
* jstream://
* jstream://
-
* Efficient for getting the inner contents of objects without the need to create an intermediate copy of the objects
+
Efficient for getting the inner contents of objects without the need to create an intermediate copy of the objects
* libstreamanalyzer
* libstreamanalyzer
* api for accessing metadata from streams
* api for accessing metadata from streams
Line 34: Line 34:
* analyzers also run on the embedded files' content
* analyzers also run on the embedded files' content
* plugins for the indexes, so in theory different indexes can be used
* plugins for the indexes, so in theory different indexes can be used
-
* most important clucene and soprano
+
most important clucene and soprano
* clucene very fast
* clucene very fast
* soprano used for nepomuk for semantic storage
* soprano used for nepomuk for semantic storage
-
* could use tracker as the index as well
+
could use tracker as the index as well
* libsearchclient
* libsearchclient
* socket access to strigidaemon
* socket access to strigidaemon
Line 70: Line 70:
* simple api was too simple - so, live was the way to go
* simple api was too simple - so, live was the way to go
* xesam
* xesam
-
* dbus search api
+
dbus search api
-
* ontology
+
ontology
-
* xml query language
+
xml query language
-
* user search language
+
user search language
* full draft online - devil in the details
* full draft online - devil in the details
* summarizing critique:
* summarizing critique:
-
* too complex ontology
+
too complex ontology
-
* feature creep (it's getting too complex)
+
feature creep (it's getting too complex)
-
* not extensible enough
+
not extensible enough
-
* vulnerability to extend and embrace tactics
+
vulnerability to extend and embrace tactics
Line 130: Line 130:
* and there are many other use cases of this type of relationships that will be too difficult to implement with the current xesam model efficiently
* and there are many other use cases of this type of relationships that will be too difficult to implement with the current xesam model efficiently
* photo annotations
* photo annotations
-
* defining region of a photo and who is there
+
defining region of a photo and who is there
-
* thus, the annotation is complex
+
thus, the annotation is complex
-
* example - photos of old people
+
example - photos of old people
-
* move toward semantic representation of data. FOAF example of the photo annotations
+
move toward semantic representation of data. FOAF example of the photo annotations
-
* semantic web activities going on in this area at the moment
+
semantic web activities going on in this area at the moment
-
* another example: papers referring to other papers via identifiers without directly pointing to a file
+
another example: papers referring to other papers via identifiers without directly pointing to a file
-
* point being that the client will have to be doing the work that really makes sense to be done on the backend side
+
point being that the client will have to be doing the work that really makes sense to be done on the backend side
-
* suggestion: use xesam (tm) if it correctly implements the core set of the functionality
+
suggestion: use xesam (tm) if it correctly implements the core set of the functionality
-
* we will need to be future proof and not stagnate and have people making hacks and workarounds
+
we will need to be future proof and not stagnate and have people making hacks and workarounds
-
* philip van hoof: the standard will be successful only if the client developers take it into use
+
philip van hoof: the standard will be successful only if the client developers take it into use
-
* martyn: it's an iterative process
+
martyn: it's an iterative process
-
* client developers are our users
+
client developers are our users
-
* using xesam needs to be easy, and it needs to provide the needed information easily
+
using xesam needs to be easy, and it needs to provide the needed information easily
-
* evgeny: how do we solve this conflict of interest: we need to define xesam in a flexible enough way
+
evgeny: how do we solve this conflict of interest: we need to define xesam in a flexible enough way
-
* evgeny: it is not simple enough for the users as it is
+
evgeny: it is not simple enough for the users as it is
-
* mikkel: two camps: semantic camp and field based camp. I am more in the field based camp, so I don't see needs of this type as very relevant for us
+
mikkel: two camps: semantic camp and field based camp. I am more in the field based camp, so I don't see needs of this type as very relevant for us
-
* kubasik: I agree that multidimensional graphs are not so interesting - maybe we could provide a convenience library to do this kind of deep relationship. So, perhaps xesam could just provide the field based stuff and let the wrappers worry about the more extended multidimensional use cases
+
kubasik: I agree that multidimensional graphs are not so interesting - maybe we could provide a convenience library to do this kind of deep relationship. So, perhaps xesam could just provide the field based stuff and let the wrappers worry about the more extended multidimensional use cases
-
* evgeny: my opinion would be to make it possible for the backend to do this and provide a wrapper for the simple backends to cover the field to graph use cases.
+
evgeny: my opinion would be to make it possible for the backend to do this and provide a wrapper for the simple backends to cover the field to graph use cases.
-
* maybe we could roadmap that only some parts would be available of the query language and ontology so that   
+
maybe we could roadmap that only some parts would be available of the query language and ontology so that   
-
* kubasik: if we do paged browsing, the deeper relationships will be completely awful to implement efficiently - so most often the deeper levels should not be needed
+
kubasik: if we do paged browsing, the deeper relationships will be completely awful to implement efficiently - so most often the deeper levels should not be needed
-
* evgeny: this need will be implemented either by the client or the server. use cases are there
+
evgeny: this need will be implemented either by the client or the server. use cases are there
-
* jos: can you provide a list of items to change
+
jos: can you provide a list of items to change
-
* evgeny: we need to decide whether we want to support these structures or not
+
evgeny: we need to decide whether we want to support these structures or not
-
* evgeny: can provide list of drawbacks
+
evgeny: can provide list of drawbacks
-
* currently undecided what the type of e.g. author is (reference as url or string of the author) - needs to be decided
+
currently undecided what the type of e.g. author is (reference as url or string of the author) - needs to be decided
-
* sebastian:  <missed>
+
‣ nepomuk guy:  <missed>
-
* jos: my suggestion - if I have author email address - then we go to the next version, where we have contacts - I will store the contact url to the database
+
jos: my suggestion - if I have author email address - then we go to the next version, where we have contacts - I will store the contact url to the database
-
* urho: suggesting inner queries to the query api
+
urho: suggesting inner queries to the query api
-
* evgeny: let me give more examples
+
evgeny: let me give more examples
* sparql - and current query language
* sparql - and current query language
*  
*  
* we can replace the current query language later on
* we can replace the current query language later on
-
* jamie: the mapping really doesn't exist on the actual contents, so the documents usually don't map the actual authors, but instead they just contain the freetext of the authors
+
jamie: the mapping really doesn't exist on the actual contents, so the documents usually don't map the actual authors, but instead they just contain the freetext of the authors
-
* so, as long as the applications do not contain the possibilities, the graphs won't happen
+
so, as long as the applications do not contain the possibilities, the graphs won't happen
-
* evgeny: we need to make it possible for the applications to do this, otherwise the applications will never support it
+
evgeny: we need to make it possible for the applications to do this, otherwise the applications will never support it
-
* sebastian: shouldn't we make the inner queries optional as well as the graph queries
+
‣ nepomuk guy: shouldn't we make the inner queries optional as well as the graph queries
* intermediate solution - flat view would map queries to text and on the graph capable implementations to the object url instead
* intermediate solution - flat view would map queries to text and on the graph capable implementations to the object url instead
* possibility to create dummy items for each url and containing the string in that object
* possibility to create dummy items for each url and containing the string in that object
-
* kubasik: this will need a lot more support from the backend and it will be much more complicated to do - would be more future compatible to be on the xesam
+
kubasik: this will need a lot more support from the backend and it will be much more complicated to do - would be more future compatible to be on the xesam
-
* sebastian: internal stuff could create the internal representations in hackish way and just expose the data in graph way
+
‣ nepomuk guy: internal stuff could create the internal representations in hackish way and just expose the data in graph way
-
* evgeny: proposing a far reaching goal to make it easy for users in mid term future to transition to this - maybe a roadmap for applications to be able to produce more complex queries
+
evgeny: proposing a far reaching goal to make it easy for users in mid term future to transition to this - maybe a roadmap for applications to be able to produce more complex queries
-
* Urho: so, do you propose SPARQL?
+
Urho: so, do you propose SPARQL?
-
* evgeny: nope - we should be able to extend current XML
+
evgeny: nope - we should be able to extend current XML
-
* sebastian: we could create related to inner queries
+
‣ nepomuk guy: we could create related to inner queries
-
* mikkel: why not just allow the results of one query to be the source for the next one
+
mikkel: why not just allow the results of one query to be the source for the next one
-
* urho: so inner queries - yeah
+
urho: so inner queries - yeah
-
* evgeny: makes an example of the paper question in sparql
+
evgeny: makes an example of the paper question in sparql
-
* evgeny: we are not looking for using sparql, just similar functionality
+
evgeny: we are not looking for using sparql, just similar functionality
-
* evgeny: mapping sparql to sql can be done
+
evgeny: mapping sparql to sql can be done
-
* jos: I propose that we won't implement this to 1.0
+
jos: I propose that we won't implement this to 1.0
* just make the ontology as future proof
* just make the ontology as future proof
* make a temporary solution and future path
* make a temporary solution and future path
-
* evgeny: list of use cases
+
evgeny: list of use cases
* we have no way to query - semantic desktop guys will not be able to use xesam in the way they can use nepomuk
* we have no way to query - semantic desktop guys will not be able to use xesam in the way they can use nepomuk
* maybe we could at least make a wrapper library?
* maybe we could at least make a wrapper library?
-
* JOS: making the proposal
+
JOS: making the proposal
* usually we store the name of author in a field on document
* usually we store the name of author in a field on document
* in graph we would store it on separate object
* in graph we would store it on separate object
* so, propose that we do it like this now and if engine supports the denormalized model, it will do the conversion from the object to the name
* so, propose that we do it like this now and if engine supports the denormalized model, it will do the conversion from the object to the name
* and we extend the ontology the denormalized way
* and we extend the ontology the denormalized way
-
* sebastian: we have the proper ontology in nepomuk, we could just translate that to xesam
+
* nepomuk guy: we have the proper ontology in nepomuk, we could just translate that to xesam
* kubasik: suggest that we make a small team to discuss this in real detail
* kubasik: suggest that we make a small team to discuss this in real detail
* agreed: 1st item: FAIL -  
* agreed: 1st item: FAIL -  
Line 217: Line 217:
* kubasik: we should decide whether to use the properties on the xml or on the dbus
* kubasik: we should decide whether to use the properties on the xml or on the dbus
* mikkel: properties can go on the search object instead (written)
* mikkel: properties can go on the search object instead (written)
-
* philip: I agree with sebastian (sebastian) hit fields should be in the query
+
* philip: I agree with sebastian (nepomuk guy) hit fields should be in the query
* philip: we can go instead for the session changes to be for the xesam 2.0 and not yet to the 1.0
* philip: we can go instead for the session changes to be for the xesam 2.0 and not yet to the 1.0
* philip: if we change the query language, that'll break everything anyway
* philip: if we change the query language, that'll break everything anyway
Line 305: Line 305:
* kubasik: we would need to do nested queries most probably and not probably an optimized index for this
* kubasik: we would need to do nested queries most probably and not probably an optimized index for this
* sebastian: normally you would say give me all emails that have sender of contact object that has name like this
* sebastian: normally you would say give me all emails that have sender of contact object that has name like this
-
* now you could do it sender as a range, of person object, it would check all xesam id values in the relation
+
now you could do it sender as a range, of person object, it would check all xesam id values in the relation
* mikkel: it should be very easy to do these queries
* mikkel: it should be very easy to do these queries
-
* and it should be very easy to show the values of these fields
+
and it should be very easy to show the values of these fields
* jos: we should have two fields: readable value and pointer/relation property. Also, the stronger engines could say that the value is just an alias of the actual object's relation property
* jos: we should have two fields: readable value and pointer/relation property. Also, the stronger engines could say that the value is just an alias of the actual object's relation property
* sebastian: why not use the uri for data representation plugins that would then be able to create the visible widgets of the data items
* sebastian: why not use the uri for data representation plugins that would then be able to create the visible widgets of the data items
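The nested query sebastian describes above (all emails whose sender is a contact object with a matching name) can be sketched as two chained lookups, where the result of the inner query feeds the outer one. This is only an illustrative Python sketch and not Xesam API: the `Contact` and `Email` classes, the `sender` relation field, the URI schemes and both function names are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    uri: str   # identifier of the contact object (hypothetical scheme)
    name: str  # human-readable name


@dataclass(frozen=True)
class Email:
    uri: str
    subject: str
    sender: str  # relation property: URI of a Contact, not a free-text name


def contacts_named(contacts, fragment):
    """Inner query: URIs of contacts whose name contains the fragment."""
    needle = fragment.lower()
    return {c.uri for c in contacts if needle in c.name.lower()}


def emails_from(emails, contacts, fragment):
    """Outer query: emails whose sender is in the inner query's result set."""
    sender_uris = contacts_named(contacts, fragment)
    return [e for e in emails if e.sender in sender_uris]


# Made-up sample data, purely for illustration.
contacts = [
    Contact("contact:1", "Alice Example"),
    Contact("contact:2", "Bob Example"),
]
emails = [
    Email("mail:1", "hackfest agenda", "contact:1"),
    Email("mail:2", "lunch", "contact:2"),
    Email("mail:3", "day one notes", "contact:1"),
]

matches = emails_from(emails, contacts, "alice")
print([e.uri for e in matches])
```

A field based engine would have to run the two steps separately, as here; a graph capable engine could resolve the relation in a single pass.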
Line 334: Line 334:
* <lots of discussion>
* <lots of discussion>
* Conclusion:
* Conclusion:
-
* We turn some older string properties into URIs instead of the previous string values
+
We turn some older string properties into URIs instead of the previous string values
-
* All new URI properties also have a .label that gives the textual representation of the property, which in essence is the old xesam ontology value of the property. Ideally this is a pointer to a specific property of the linked object
+
All new URI properties also have a .label that gives the textual representation of the property, which in essence is the old xesam ontology value of the property. Ideally this is a pointer to a specific property of the linked object
-
* We will draft nested queries and they will most probably go to 1.1
+
We will draft nested queries and they will most probably go to 1.1
*
*
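One way to read the conclusion about URI properties and their `.label` is sketched below: a property holds a URI, and its label is either resolved from the linked object (graph capable engines) or taken from a stored string (flat engines). This is a minimal sketch under stated assumptions, not the Xesam ontology itself: the `author` property, the `name` field of the linked object, the `contact:` URIs and the `label_of` helper are all hypothetical.

```python
def label_of(item, prop, objects):
    """Return the text form of a URI-valued property.

    Prefer a property of the linked object; fall back to the stored
    '<prop>.label' string when the link cannot be resolved.
    """
    linked = objects.get(item.get(prop))
    if linked is not None and "name" in linked:
        return linked["name"]
    return item.get(prop + ".label")


# Hypothetical object store: only contact:1 exists as a real object.
objects = {"contact:1": {"name": "Alice Example"}}

# The link resolves, so the label comes from the linked object.
graph_doc = {"author": "contact:1", "author.label": "A. Example"}
# The link does not resolve, so the stored string is used instead.
flat_doc = {"author": "contact:9", "author.label": "Bob Example"}

print(label_of(graph_doc, "author", objects))
print(label_of(flat_doc, "author", objects))
```

Client code only ever asks for the label, so it does not need to know which kind of engine answered.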
-
 
-
 
-
[[Category:Community]]
 
-
[[Category:Development]]
 
