[Sugar-devel] On datastore object IDs

Eben Eliason eben at laptop.org
Thu Jul 2 14:44:25 EDT 2009


On Thu, Jul 2, 2009 at 2:21 PM, Benjamin M.
Schwartz<bmschwar at fas.harvard.edu> wrote:
> Eben Eliason wrote:
>> On Thu, Jul 2, 2009 at 1:40 PM, Benjamin M.
>> Schwartz<bmschwar at fas.harvard.edu> wrote:
>>> Eben Eliason wrote:
>>>> Hmm, I think that only objects have titles and user editable metadata
>>>> (tags/description, etc.). If I open Write, and name that Document in
>>>> the usual way, that title should be associated with that object. The
>>>> action will happen to read like:
>>>>
>>>> Wrote [My Story]
>>>>
>>>> And clicking on the icon next to "My Story" in the action should
>>>> resume the activity, but the name actually belongs to the object
>>>> that the action refers to.
>>> I think sessions can have names too.  If I start an instance of the Chess
>>> activity and share it with the name "Bedford Middle School Chess Club
>>> Summer Tournament", then that's the name of the entire session, even if it
>>> produces multiple Documents or no Documents.
>>
>> Well, I think that the session is defined by the name provided in the
>> current name field, regardless of whether or not a Document is
>> created.
>
> I agree.

But I also think that that name will apply to the Document itself,
most of the time. When there's a 1-1 mapping between session and
Document, this is natural. When the activity produces more than one
Document, as in Record, we both agree there should be some
representation of the "session" which is resumable. For some
activities, this session might actually have a blob of data; in
others it may not.

>> In write, that name would refer to the Document object. In
>> most activities, this name will map to the object. In Record, the name
>> refers to the "roll of film", which is why I thought there might be a
>> "session object" in some cases.
>
> I am suggesting that there _always_ be a session object, and that this
> object be the Action.  This object may or may not have an associated blob.

So I guess we both agree that the default single-title naming scheme
currently employed in the UI is fine as is. We just need to figure out
exactly how that maps onto actions and objects, and how session
objects are stored and represented. I was under the assumption that
this Action object in the DS would be managed solely by the Journal,
in which case an activity that wanted to store a blob for its session
would do this in a separate object.

So taking record again, and assuming that the "roll of film" had a
file format of its own, would we have:
[action object], [roll of film object], [[photo 1], ...]

or just:
[action object], [[photo 1], ...]

?
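
To make the two layouts concrete, here is a minimal Python sketch.
The class names (DSObject, Action) and their fields are hypothetical,
invented for illustration; they are not the actual datastore API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DSObject:
    """Hypothetical datastore entry; the blob is optional."""
    title: str
    blob: Optional[bytes] = None


@dataclass
class Action:
    """Hypothetical session record referring to the objects it produced."""
    title: str
    objects: List[DSObject] = field(default_factory=list)


photos = [DSObject("photo 1", b"jpeg-1"), DSObject("photo 2", b"jpeg-2")]

# Layout A: the Journal manages the action; Record keeps its session
# state in a separate "roll of film" object alongside the photos.
roll = DSObject("roll of film", b"record-session-state")
layout_a = Action("Field trip", [roll] + photos)

# Layout B: the action itself stands in for the session, so there is
# no separate roll object, only the photos.
layout_b = Action("Field trip", list(photos))
```

In layout A the activity owns one more object than in layout B; the
open question is only who manages the session state.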

>> I don't think it makes sense to name actions independently of their
>> activity sessions/objects.
>
> I am suggesting a model in which the Action of using an Activity is
> represented in the datastore by an object representing the session.

Right. So the question, I guess, is whether or not that session object
is managed by the activity, and if it contains a blob.

>>>>>  When I share an object
>>>>> with you, it may get a new tree_id and a version_id of 1, but it keeps all
>>>>> its metadata, including its title.
>>>> This seems fundamentally wrong to me, but perhaps that's because I'm
>>>> not actually seeing how these 2 problems you bring up are problems.
>>> What seems wrong? What problems?
>>
>> I thought the two items you had previously brought up were problems
>> with the idea of associating a single object with multiple actions.
>
> In this particular case, I'm referring to the case of pushing a Document
> to a friend over the network.  I am suggesting that the Document arrives
> without any version history, and without any of its Action associations,
> and so gets a new Action (whose title is 'Transferred "$NAME" from
> "$SENDER"') a new tree_id, and a new version_id ('1').

Hmmm, I see. Well, I'm not sure which way I like this better. I agree
we send off the object without the history and action associations,
and that on the receiving end it basically lives as the root of a new
tree (new tree_id), associated with a "Received [object] from
[friend]" action. It's unclear to me that the person who sent this
should also create a new object with a new tree_id. I think not,
actually. I'd expect to see a reference to the thing I sent them.
This is because, for
instance, I might resume that object from the "Sent [object] to
[friend]" action expecting to continue working on it, simply because
the fact that I sent it to them recently was the easiest way for me to
find it. I wouldn't expect to be in a new history with new metadata in
this instance.
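
A rough sketch of that asymmetry, assuming hypothetical Document and
Action types and send/receive helpers (none of these names come from
the real datastore): the sender's action points at the existing
(tree_id, version_id), while the receiver starts a fresh tree.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Document:
    """Hypothetical versioned object identified by (tree_id, version_id)."""
    tree_id: str
    version_id: int
    metadata: Dict[str, str]


@dataclass
class Action:
    """Hypothetical action holding references to documents."""
    title: str
    refs: List[Tuple[str, int]] = field(default_factory=list)


def send(doc: Document, friend: str) -> Action:
    # Sender side: no new object, just an action that references the
    # document that was sent, so resuming continues the same history.
    title = f'Sent "{doc.metadata["title"]}" to "{friend}"'
    return Action(title, [(doc.tree_id, doc.version_id)])


def receive(doc: Document, sender: str) -> Tuple[Document, Action]:
    # Receiver side: metadata is kept, but history and action
    # associations are dropped; the copy is the root of a new tree.
    new_doc = Document(uuid.uuid4().hex, 1, dict(doc.metadata))
    title = f'Received "{new_doc.metadata["title"]}" from "{sender}"'
    return new_doc, Action(title, [(new_doc.tree_id, new_doc.version_id)])
```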

>>>> Why can't a given Document (tree_id, version_id) be referenced from
>>>> any number of actions?
>>> It could be.  It just seemed simpler to disallow it.
>>>
>>> If editing a Document's metadata produces a new Document, as befitting our
>>> Copy-on-Write model of versioning, then the process of editing the
>>> "associated_actions" field produces a new version.  Therefore, every time
>>> an Action adds a Document to itself, the process of adding the
>>> back-reference would produce a new version of the Document, so only one
>>> Action would ever end up referring to one version of a Document.
>>
>> Metadata is attached to the version, I believe. I don't think we
>> should be versioning metadata, so I don't think that it makes sense to
>> create a new version when changing the metadata.
>
> I don't see such a big distinction between the data and metadata.  In
> fact, Activities whose state is easily represented as key:value pairs can
> put their entire state into the metadata, instead of storing it in a blob.

Hmm, I do see a distinction, actually. Though perhaps it depends on
the type. As an example:

1. I make an image.
2. I make several changes to this image over time, resulting in new versions.
3. I decide that one of these intermediate images was meaningful in
some way, and desire to tag it accordingly.

I definitely don't want changing the description, or the tags, on some
previous version to inadvertently a) make a new version and b) make
that new version the most recent (and therefore most exposed) version.

Perhaps we need to bite the bullet and consider having both versioned
and unversioned metadata...
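
One way this split could look, as a sketch only: the Store class and
its commit/annotate methods are hypothetical names, and which keys
count as versioned would be a policy decision not settled here.

```python
from typing import Dict, Tuple

Key = Tuple[str, int]  # (tree_id, version_id)


class Store:
    """Hypothetical store with versioned and unversioned metadata."""

    def __init__(self) -> None:
        self.versioned: Dict[Key, Dict[str, str]] = {}
        self.unversioned: Dict[Key, Dict[str, str]] = {}
        self.latest: Dict[str, int] = {}

    def commit(self, tree_id: str, metadata: Dict[str, str]) -> int:
        # Copy-on-write: a change to versioned state makes a new version.
        version_id = self.latest.get(tree_id, 0) + 1
        self.versioned[(tree_id, version_id)] = dict(metadata)
        self.unversioned[(tree_id, version_id)] = {}
        self.latest[tree_id] = version_id
        return version_id

    def annotate(self, tree_id: str, version_id: int,
                 key: str, value: str) -> None:
        # Tags and descriptions are edited in place on an existing
        # version; no new version is created and `latest` is untouched.
        self.unversioned[(tree_id, version_id)][key] = value


store = Store()
for step in ("sketch", "colored", "final"):
    store.commit("image", {"title": step})
store.annotate("image", 2, "tags", "meaningful")
```

Tagging version 2 here leaves version 3 as the most recent one, which
is the behavior the image example asks for.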

>>> If editing a Document's metadata doesn't produce a new Document, then we
>>> have a hilarious race condition in which two Activities, both referencing
>>> the same Document, edit the "associated_actions" field at the same time,
>>> and one of them ends up getting dropped, producing a corrupted datastore.
>>
>> This seems like the most plausible case. Really, though, it should be
>> the Journal (or DS) that's responsible for setting up all of these
>> references, and not the activities. If activities want to destroy
>> metadata, they can destroy metadata. But I don't think the race
>> condition exists if the activities aren't expected to make the
>> references themselves. Or, can we put a mutex wrapper around metadata
>> changes?
>
> It sounds like you want case 3:
>
>>> If back-references aren't stored in the Document's metadata, then either

The hypothetical in case 3 is that we don't store the info in the
metadata. I'm suggesting that we still store it in metadata, but that
the activity itself isn't the one in charge of handling it. It
shouldn't have to.

The side note about putting locks on metadata changes seems like a
generally useful thing to do for robustness, which is why I brought it
up.
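
As a sketch of what such a lock might look like, assuming all metadata
writes funnel through a single process: MetadataStore and add_action
are hypothetical names, and threading.Lock only protects threads
within that one process.

```python
import threading
from typing import Dict, Tuple

Key = Tuple[str, int]  # (tree_id, version_id)


class MetadataStore:
    """Hypothetical store that serializes metadata updates with a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._metadata: Dict[Key, Dict[str, tuple]] = {}

    def add_action(self, doc: Key, action_id: str) -> None:
        # The read-modify-write of "associated_actions" happens under
        # the lock, so concurrent callers cannot drop each other's entry.
        with self._lock:
            entry = self._metadata.setdefault(doc, {})
            current = entry.get("associated_actions", ())
            entry["associated_actions"] = current + (action_id,)

    def actions(self, doc: Key) -> tuple:
        with self._lock:
            return self._metadata.get(doc, {}).get("associated_actions", ())


store = MetadataStore()
threads = [
    threading.Thread(target=store.add_action,
                     args=(("tree-a", 1), f"action-{i}"))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After all threads finish, every back-reference is present regardless
of the order in which the updates ran.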

Eben


>>> the datastore has to maintain an inverted index of the references in the
>>> Actions, or it has to perform an enormously expensive search to find which
>>> Actions are associated with a given Document.  This adds complexity.
>
> --Ben
>
>

