asutherland at asutherland.org
Mon Jun 14 18:30:32 UTC 2010
On 06/14/2010 10:15 AM, Blake Winton wrote:
> DavidA let me know that we aren't indexing on date, so this query will
> run slower than it might otherwise, and hopefully that can be fixed
> fairly easily.
For clarity purposes, there is an index on the date column. But we
would need an index over the folder id and date together for it for your
query to be cheap. The query probably spends a lot of time copying
message rows into an ephemeral result table sorted by date to satisfy
the ordering constraint and then perform the limit. The one upside is
it only ever needs to keep 50 rows in the table at a time, but there
still is a lot of busywork.
> The other idea I had (and I don't know if it's technically feasible),
> was for partial updates to be sent through onItemsAdded, before the
> first query is finished. Currently, I get a long pause, followed by a
> single call to onItemsAdded, and then immediately afterwards, a call
> to onQueryCompleted. If I could get the data fed to me sooner, piece
> by piece, I believe that would give the user the impression of us
> being faster than we are.
That's unlikely to make a major difference. SQLite can't start
generating message result rows until it has seen literally every message
row in the folder, and then it will zip through them quickly. The
potential savings of chewing the messages in smaller bites is that the
follow-on conversation and identity/contact/identity queries we issue
can be pipelined. I'd be interested in seeing how long those queries
take to service before doing too much more work with that. Assuming you
have gloda debug logging enabled, I think that should be pretty easy to
(Note that most of the logic is in place to take smaller bites and it
might even be a matter of constants, if you want to try your hand at
that. I still would like numbers though :)
> And in a related suggestion, some way of paging through the results
> would let me load fewer initially, and then add more later.
Yeah, this is required for the virtual widget list mechanism too. We
could do this within the current schema by putting the output of the
query into a temporary table and then slicing that instead of using
limits. Unfortunately the up-front query would still be pretty
expensive as you're currently seeing. We'll need a schema change with a
full reindex required to make sure we bake dates into all of our indices
to improve performance. I was originally hoping to get that in for 3.1
but that didn't work out at all.
I suspect I'm going to end up creating a (p)branch for changes to gloda
in the near future, as there's a lot of stuff that needs to happen. We
should be able to improve the treatment of conversations in the process too.
One very useful thing would be if you can figure out what other
efficient queries your UI is going to want for always-available instant
access and what can perhaps suffer some on-demand processing. With
dates on the queryable indices we should be able to convince SQLite to
do some pretty efficient intersections even when one of the query fields
has a pretty large return set, but there are cases where we're going to
need to generate synthetic/aggregate attributes to be indexed or opt for
filtering. For example, unread status is never going to be fast enough
on straight set intersection; we need to specialize on that one.
More information about the tb-planning