Feedback requested on Services Metrics proposal

Rob Miller rmiller at mozilla.com
Fri Oct 14 13:48:38 PDT 2011


On 10/14/11 1:11 PM, Gregory Szorc wrote:
> On 10/14/11 11:41 AM, Rob Miller wrote:
>> Actually, what you've described is almost precisely what I intended
>> with the current proposal. What I call "metadata" is what you called
>> "arbitrary key-value pairs". The "message" part is the payload, and it
>> can be a string, JSON, or anything that can be serialized and sent
>> over the wire.
>
> Ahhh. I was thinking the payload field(s) would be typed (i.e. no need
> to look at the type of a field, just the presence). Maybe have payload
> exist as a separate field altogether, enforcing the separation of
> envelope from payload. e.g.
>
> {
>    logger: "foo",
>    metadata: { ... },
>    severity: 3,
>    type: "request.finish",
>    payload: {
>       string: "Request finished in 0.002 seconds",
>       fields: {
>          duration: 0.002
>       }
>    }
> }

Again, we're in almost complete agreement.  Here's the structure as I 
currently imagine it:

{
   logger: "foo",
   tags: { ... },
   severity: 6,
   type: "...",
   env_version: 0.1,
   message: "...",
   timestamp: "2011-10-14T20:36:54.609806"
}

There are some naming differences ("payload" may be a better choice than 
"message", though the two mean the same thing in the structure), but the 
only real difference is that I don't imagine defining the structure of 
the payload at all; we just say that it's string data. That string may 
(and often will) be a serialized JSON object with its own rich 
structure. I don't see that as a concern of the spec itself, but rather 
as something that will flesh itself out as we build a corpus of message 
types.
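
To make that concrete, here is a minimal sketch of a sender building 
the envelope above. Python, the make_envelope helper, and its defaults 
are assumptions for illustration only; nothing in the proposal 
prescribes them. The point is that the payload goes over the wire as a 
string, and whether that string holds JSON is up to the message type.

```python
import json
from datetime import datetime, timezone

ENV_VERSION = 0.1  # envelope version, as in the structure above


def make_envelope(logger, msg_type, payload, severity=6, tags=None):
    """Hypothetical helper: wrap a payload in the proposed envelope."""
    # The envelope only promises string data. Serializing structured
    # payloads as JSON is a per-message-type choice, not a spec rule.
    message = payload if isinstance(payload, str) else json.dumps(payload)
    # Naive UTC timestamp, matching the format shown in the structure.
    now = datetime.now(timezone.utc).replace(tzinfo=None)
    return {
        "logger": logger,
        "tags": dict(tags or {}),
        "severity": severity,
        "type": msg_type,
        "env_version": ENV_VERSION,
        "message": message,
        "timestamp": now.isoformat(),
    }


env = make_envelope("foo", "request.finish", {"duration": 0.002})
wire = json.dumps(env)  # what would actually be sent
```

A consumer that knows the "request.finish" type would call json.loads 
on env["message"]; one that doesn't can still treat it as opaque text.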

>> The only difference between what you described and what I've proposed,
>> AFAICT, is that you pulled "type" and "labels" out into top level
>> pieces of the envelope, while I imagined them as embedded within the
>> arbitrary key-value pairs.
>
> That is what I proposed for consideration in my initial email, yes. The
> argument I am throwing out there is that the stronger you define your
> messages, the easier it becomes for downstream systems to process and
> consume said messages. Without a required message "type" (presence in
> metadata/tags/labels/whatever would imply it being optional as I am
> reading things), you would have 1 class of messages with types that are
> easily consumable (effectively a switch statement in code) and those
> that aren't (requiring inference of message type from other fields or
> payload). You quickly realize inference sucks (redundant and/or
> inconsistent logic across downstream agents, imperfect detection,
> sensitive to seemingly innocuous schema changes, less overhead to
> calculate, etc), so, you end up adding "type" on all messages to avoid
> it. At this point, you wonder why you didn't avoid this mess altogether
> by making "type" required.

Yup, I've already agreed that type makes sense as a top level field 
rather than something living in the metadata.
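
For what it's worth, this is roughly how I picture a downstream 
consumer using a required top-level type. It's only a sketch: the 
handler, the HANDLERS table, and the dispatch function are hypothetical 
names, and the "request.finish" payload is borrowed from your example 
above.

```python
import json


def handle_request_finish(envelope):
    # This handler knows its own payload format (JSON, by assumption).
    fields = json.loads(envelope["message"])
    return "request took %.3fs" % fields["duration"]


# Hypothetical routing table: message type -> handler.
HANDLERS = {
    "request.finish": handle_request_finish,
}


def dispatch(envelope):
    """Route on the top-level type; no inference from other fields."""
    handler = HANDLERS.get(envelope["type"])
    if handler is None:
        return None  # unknown type: skip it (or forward it untouched)
    return handler(envelope)
```

The lookup never has to inspect tags or parse the payload to decide 
where a message goes, which is the property you were arguing for.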

> To address a point JR made, "type" isn't perfect (it is also vulnerable
> to schema changes over time). But, these can be combated with a good
> review process, a central catalog to define types (or tuples that
> constitute a unique type), and versioning types or payloads when they
> inevitably change (either version the type name - add V1 - or add a
> "version" field to the payload for that specific type).

An "envelope version" value (env_version) is specified in my data 
structure above, which will allow us to demarcate changes in the 
envelope structure.  The structure of specific types of messages can 
and should be versioned as well.  I'm inclined to let that be optional, 
i.e. embedded in the message tags, but could possibly be convinced that 
a "msg_version" field belongs at the top level.
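
A sketch of how a consumer might treat the two kinds of version under 
the optional approach. The function names and the "msg_version" tag key 
are assumptions on my part, not settled parts of the proposal.

```python
# Envelope versions this (hypothetical) consumer understands.
SUPPORTED_ENV_VERSIONS = {0.1}


def message_version(envelope, default=None):
    """Read the optional per-message version out of the tags."""
    return envelope.get("tags", {}).get("msg_version", default)


def check_envelope(envelope):
    """Reject unknown envelope versions; return the message version."""
    env_version = envelope.get("env_version")
    if env_version not in SUPPORTED_ENV_VERSIONS:
        raise ValueError("unsupported env_version: %r" % (env_version,))
    # May be None, because versioning the message itself is optional here.
    return message_version(envelope)
```

If msg_version moved to the top level, message_version would read it 
from the envelope directly instead of from the tags.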

-r


More information about the Services-dev mailing list