Revisiting the War on RDF, especially folder lookup

Kent James kent at caspia.com
Tue Jan 22 19:51:14 UTC 2013


As it turns out, just last week I wrote a bit of code surrounding the 
Archive functionality for ExQuilla, which faces many of the issues that 
you describe here. So this topic is very familiar to me at the moment 
(and just another example of the eerie history that rkent and jcranmer 
are almost always working on the same issues at the same time). I'll 
make some comments up front, and some comments inline.

There are several general classes of issues that I have faced many times 
in working with the folder code. Let me mention them first.

1) Confusion between the creation of the folder object, and the folder 
storage - and the related question of who does initializations. Yo 
already mentioned the initial startup as an issue which is related to this.

2) Handling the URIs of special folders. We have a number of folders 
that have special URIs, like "trash" or "junk". Keeping those straight 
is tricky. It would be good if the handling of those was centralized. 
(This is your "normalized to specific case-sensitive variants").

3) Mixed sync/async models. Since many folder operations are sync for 
local folders and async for IMAP, much of the code has parallel paths 
depending on whether a folder is async or not (see for example 
processNextBatch in mailWIndowOverlay.js, also see bug 832034 
<https://bugzilla.mozilla.org/show_bug.cgi?id=832034>). A related issue 
is the type of async handling that this implies. The most common model 
in mailnews code is to listen for a folder event, like 
nsIMsgFolderListener.folderAdded.  But this has no error handling 
builtin, and there is no reliable method that always gets called 
regardless of the result. In my own code, I usually end up creating a 
class like CreateSubfolderWithListener that has a reliable listener that 
is guaranteed to be called, and serves as a reliable termination point 
for async processing.

I would propose that calls that may be async be replaced by calls that 
are always async, or that return whether async processing is needed - 
and the async termination  (such as nsIURLListener.OnStopRunningURL) is 
guaranteed to be called after the return from the main routine.

On 1/21/2013 2:11 PM, Joshua Cranmer wrote:
> ...
> There is a single bug we could fix that would eliminate most of our 
> (apparent) use of RDF: bug 441437, the folder lookup service. All 
> folders are associated with a unique URI which can be used to get a 
> folder by calling nsIRDFService::GetResource. That function is 
> extremely magical: the RDF service maintains a weak map of all RDF 
> resources (including folders), and will lazily create them if it's not 
> in its map. This creation is magical, but it is how resources are 
> expected to be created. The problem is you can't tell it to not create 
> a folder when you don't want to, and there are times when calls are 
> used by Thunderbird code in expectation of a lazy creation.
>
> The folder lookup service, as originally proposed, eagerly built the 
> map of all folders by recursively iterating all folders in the tree. 
> This turns out to be a problem since iterating folders in IMAP can 
> result in actual folder creation, which caused a suite of bugs that 
> forced the original implementation to be backed out shortly after it 
> first landed. An alternative proposed implementation amounts to a lazy 
> map creation, but this starts to run into some issues due to 
> ill-defined semantics, as I'll note below.
Amen to the issue of ill defined semantics! The word "folder" in this 
paragraph already could mean three different things: 1) the nsIMsgFolder 
XPCOM object, 2) the folder storage on the server, or 3) the reflection 
of the folder storage in a local nsIMsgDatabase file implementation.
>
> To properly fix the bug, I think we need to sit down and discuss a 
> topic that has been surprisingly evasive in our current documentation: 
> how our backend server/folder infrastructure works. With respect to 
> the folder lookup service, the important questions that need consensus 
> answer are as follows:
>
> 1. Who creates folder objects? What API calls are necessary to do it?
> 2. What process causes the creation of folder objects?
> 3. What semantics to folder URIs are necessary to support the above 
> responses?
>
>
> The answers as currently applies to the codebase are as follows:
> 1. The RDF service does it. Folder creation and initialization happens 
> via specific contracts and nsIRDFResource::Init.
> 2. This is created on first use of a folder. Even if the folder 
> doesn't exist.
> 3. There is a very inherent assumption that folders are '/' delimited, 
> and the root folder's URI is identical to the incoming server's URI. 
> In addition, some components are expected (in 
> nsMsgDBFolder::AddSubfolder) to be normalized to specific 
> case-sensitive variants. And don't get me started on which methods 
> expect which URIs to be percent-encoded.
>
>
> I think the most tenable option for a folder-lookup service that 
> replaces the current mess is a lazy one, which looks like this:
>
> getFolderForURI(folderURI, forceCreation) {
>   if (folderURI in map)
>     return map[folderURI];
>   else {
>     folder = createFolderObject(folderURI, forceCreation);
>     if (folder) map[folderURI] = folder;
>     return folder;
>   }
> }
You need to call this "getFolderObjectForURI" to keep the semantics 
straight.
>
> (Note that "forceCreation" here has the semantics "if the folder 
> doesn't physically exist, then make it happen" with the expectation 
> that it will default to false for most people.
I think that you are already introducing confusion here between the 
"folderObject" and the "folderStorage". I really don't think you should 
mix those in the way that you are proposing here. "forceCreation" should 
refer to the folderObject only, so that getFolderObjectForURI can safely 
be sync. Follow that call with nsIMsgFolder::CreateStorageIfMissing (an 
async operation) if you need the storage.

Just as a concrete example, they way that I ended up doing the archive 
function with folder location preservation was as follows.

1) Generate the correct URI for the archive folder.
2) Create the folderObject using RDF::GetResource
3) Call a recursive nsIMsgFolder::CreateStorageIfMissing(listener) on 
the folder object, recursive in that it creates its parent if the parent 
is also missing.

In the calls that you have proposed, there is no way to create a 
folderObject without storage to then call CreateStorageIfMissing from. 
You have also created one of these "sometimes async" calls which I think 
is part of the current confusion.

> There are one or two places where we actively desire those semantics.) 
> Assuming no one has any objections to this implementation strategy, 
> the only hard part is deciding what "createFolderObject" looks like. 
> One way that reuses current APIs is effectively a "compute parent 
> folder and do parent.addSubfolder(...)" call on it, but that loses the 
> forceCreation semantics. In such a scenario, it appears that the best 
> course of action is to create a new API on nsIMsgFolder or 
> nsIMsgIncomingServer that creates the folder, which would retain both 
> parameters.
>
> Thus, the API proposal which I think makes the most sense is the 
> following:
> interface nsIFolderLookupService {
>   nsIMsgFolder getFolderForURI(in AUTF8String folderURI, [optional] in 
> bool forceCreation);
> }
> interface nsIMsgIncomingServer {
>   nsIMsgFolder makeFolder(in AUTF8String folderURI, in bool 
> createIfNotExists);
> }
My alternate proposal would be:

interface nsIMsgIncomingServer {
   nsIMsgFolder getFolderObjectForURI(in AUTF8String folderURI, 
[optional] in bool forceCreation);
}

and use the existing nsIMsgFolder.createStorageIfMissing(in 
nsIUrlListener) method (though possibly with a bool return to indicate 
that an async process was started to create the folder storage). But 
note that forceCreation simply means create the folderObject, not the 
folderStorage or its reflection in a nsIMsgDatabase file. That way 
getFolderObjectForURI is always sync. (I'm not sure that the existing 
imap createStorageIfMissing will recursively create missing parents).

>
> Using this as a baseline, the answers to the questions become this:
> 1. Folder objects are created by the server, via the API above.
> 2. Folder objects are created on first-use by the folder lookup service.
Now you've confused me, as you've given two different ways that "folder 
objects" are created. I really think that you need to separate the 
concerns with initialization ("on first use") from that of having a 
clear set of protocols to use for folder object, folder storage, and 
folder database creation.
> 3. The authority component of the URI is the URI of the server object; 
> the filepath attributes are left to be interpreted by server 
> implementations as they choose.
This works if createStorageIfMissing is recursive, if not then the 
archive create has to proceed forwards from the server, so you have to 
assume that the url structure is server/topFolder/subfolder1/subfolder2 
structure.
>
> Thoughts/comments/questions/concerns/flamewars/feedback?
I guess I am proposing that the folder lookup service is instead a 
server lookup service which would be an nsIMsgMessageService method. 
Servers are hard references owned by the message service. Then the 
server would lookup the folder, caching them as soft links. There would 
also be utility methods that already exist, such as the js 
getFolderForURI and the C++ GetExistingFolder.

I am punting the issue of initialization which is a whole nuther topic. 
We also need to deal with the questions of opening the message database. 
It would be fairly rare to get the folder without also wanting to get 
the database. Perhaps we need an async method to open a folder from the 
URI, which will also ensure that the database is open and up to date. We 
are moving toward having opening the database be async, so there should 
be a single async method to get and open the folder and its database.

:rkent


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.mozilla.org/pipermail/tb-planning/attachments/20130122/91a86f40/attachment.html>


More information about the tb-planning mailing list