Firefox Accounts DevOps next steps for November

Benson Wong bwong at mozilla.com
Tue Nov 5 13:23:11 PST 2013


Hi Lloyd, 

Some details on my rationale on the ops side. Yes, it is to make things go faster. It's also to keep things simple while we get a grasp on what we really need to run the service.

Some of my current ideas on HA for the service: 

On API HA:

- RDS w/ hot standby (synchronous replication).
- RDS snapshots shipped to another region. This is a new feature and can be automated via the API. Frequency TBD.
- Start in us-west-2, which, AFAIK, has been a very stable region.
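
Since the snapshot frequency is still TBD, here is a rough way to reason about it: a back-of-the-envelope sketch of the worst-case data loss if we had to restore from the shipped snapshot in the other region. The function name and the example numbers are placeholders of mine, not decisions, and it assumes taking the snapshot itself is effectively instantaneous.

```python
from datetime import timedelta

def worst_case_data_loss(snapshot_interval, copy_duration):
    """Upper bound on the age of the newest snapshot usable in the other region.

    Assumption: a snapshot taken at time t only becomes usable in the
    other region at t + copy_duration. If the home region fails just
    before the next copy lands, the newest usable snapshot is one full
    interval plus one copy time old.
    """
    return snapshot_interval + copy_duration

# Hypothetical numbers, purely for illustration.
window = worst_case_data_loss(timedelta(hours=6), timedelta(minutes=30))
print(window)
```

This only matters for the full-region-loss case; with the hot standby inside the region, synchronous replication covers the more common failures.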

The scariest scenario is when an entire region goes down and doesn't come back for many hours. In these cases, our recovery procedure will be: 

1. Decide: swap regions, or wait?
2. Do it. 
3. Decide: swap back, or leave it? 
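
Step 1 is the hard one under pressure, so it may help to agree on a rule ahead of time. A minimal sketch, assuming we pick an outage-duration threshold up front and only swap when the other region is actually ready; the threshold value and names are hypothetical, not something we have decided:

```python
from datetime import timedelta

# Hypothetical threshold: how long we wait on a regional outage before swapping.
SWAP_AFTER = timedelta(hours=1)

def should_swap_regions(outage_duration, standby_ready, swap_after=SWAP_AFTER):
    """Return True if the rule says to swap regions, False to keep waiting.

    outage_duration: how long the home region has been down so far.
    standby_ready:   whether the other region has a restored database
                     and running servers to take traffic.
    """
    if not standby_ready:
        # Nothing to swap to yet, so waiting is the only option.
        return False
    return outage_duration >= swap_after
```

For example, `should_swap_regions(timedelta(minutes=20), True)` says wait, while `should_swap_regions(timedelta(hours=2), True)` says swap. The same shape of rule could be reused for step 3 with a different threshold.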

On Content/Scrypt servers: 

- These are stateless, AFAIK, so we can spin them up anywhere and re-point DNS quickly.
- The goal here will be to make it dead simple to spin them up and redirect traffic.

Hope that answers your questions (and makes you less nervous). 

Ben.

----- Original Message -----
From: "Lloyd Hilaiel" <lhilaiel at mozilla.com>
To: "Christopher Karlof" <ckarlof at mozilla.com>
Cc: "Benson Wong" <bwong at mozilla.com>, "Ryan Kelly" <rfkelly at mozilla.com>, "Mozilla Services Operations" <services-ops at mozilla.com>, dev-fxacct at mozilla.org, "Gene Wood" <gene at mozilla.com>
Sent: Tuesday, November 5, 2013 3:58:08 AM
Subject: Re: Firefox Accounts DevOps next steps for November


Not going multi-region from day one makes me nervous.  Technology selections which make it harder make me even more nervous.  Can we hit HA requirements without it?  

Is the rationale here simply to accelerate timelines?

lloyd


