Firefox Accounts DevOps next steps for November

Benson Wong bwong at
Tue Nov 5 13:23:11 PST 2013

Hi Lloyd, 

Some details on my rationale on the ops side. Yes, it is to make things go faster. It's also to keep things simple while we get a grasp on what we really need to run the service.

Some of my current ideas on HA for the service: 


- RDS w/ hot standby (synchronous replication).
- RDS snapshots shipped to another region. This is a new feature and can be automated via the API. Frequency TBD.
- Start in us-west-2, which, AFAIK, has been a very stable region.

The scariest scenario is an entire region going down and not coming back for many hours. In that case, our recovery procedure will be:

1. Decide: swap regions, or wait?
2. Do it. 
3. Decide: swap back, or leave it? 
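
As a rough illustration only, the "swap regions, or wait?" decision in step 1 could be written down as a simple rule. Everything named here (should_swap_regions, failover_minutes, patience_minutes, and the default values) is a hypothetical placeholder, not an agreed policy:

```python
def should_swap_regions(outage_minutes, provider_eta_minutes=None,
                        failover_minutes=60, patience_minutes=60):
    """Sketch of step 1: return True to swap regions, False to wait.

    All thresholds are made-up defaults for illustration.
    """
    if provider_eta_minutes is None:
        # No recovery estimate from the provider: wait until our
        # patience runs out, then swap.
        return outage_minutes >= patience_minutes
    # With an estimate: swap only if waiting is expected to take
    # longer than the failover itself would.
    return provider_eta_minutes > failover_minutes
```

For example, should_swap_regions(10) says to wait, while should_swap_regions(90) says to swap. The same rule, with different inputs, could inform step 3 (swap back, or leave it).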

On Content/Scrypt servers: 

- These are stateless, AFAIK, so we can spin them up anywhere and re-point DNS quickly.
- The goal here will be making it dead simple to spin up new servers and redirect traffic.
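
To make the "re-point DNS" step concrete, here is a minimal sketch that assumes DNS is hosted in Route 53 (the thread does not say so). It only builds the change request as plain data. The function name, hostnames, and TTL are hypothetical:

```python
def build_repoint_change(record_name, new_target, ttl=60):
    """Build a Route 53 style change batch pointing a CNAME at a new target.

    Assumes Route 53 hosts the zone; the names passed in are placeholders.
    A short TTL keeps the redirect fast when we swap regions.
    """
    return {
        "Comment": "Re-point %s to %s" % (record_name, new_target),
        "Changes": [
            {
                # UPSERT creates the record or overwrites the existing one.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "CNAME",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": new_target}],
                },
            }
        ],
    }


# Hypothetical hostnames, for illustration only.
batch = build_repoint_change("accounts.example.com.",
                             "content-us-east-1.example.com.")
```

The resulting dict has the shape accepted as the ChangeBatch argument of boto3's route53 change_resource_record_sets call, if that is the tooling we end up with.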

Hope that answers your questions (and makes you less nervous). 


----- Original Message -----
From: "Lloyd Hilaiel" <lhilaiel at>
To: "Christopher Karlof" <ckarlof at>
Cc: "Benson Wong" <bwong at>, "Ryan Kelly" <rfkelly at>, "Mozilla Services Operations" <services-ops at>, dev-fxacct at, "Gene Wood" <gene at>
Sent: Tuesday, November 5, 2013 3:58:08 AM
Subject: Re: Firefox Accounts DevOps next steps for November

Not going multi-region from day one makes me nervous.  Technology selections which make it harder make me even more nervous.  Can we hit HA requirements without it?  

Is the rationale here simply to accelerate timelines?


