In preparation for a talk I'm giving at DOSUG, I'm going to post my thoughts as they develop. At this point I'm filling in gaps, so things might jump around a bit.
One of the design goals of Miranda was to let it span availability zones. This addresses a limitation of the previous system, Prospero. Prospero had trouble crossing availability zones because the database it used, Mnesia, did not cope well with the latency between zones. Furthermore, Prospero relied heavily on RabbitMQ, and every round trip added network delay.
Miranda does not use Mnesia or RabbitMQ, so it does not have these problems.
If a hurricane takes down a data center, the remaining nodes will keep going. Subscriptions that the lost node was responsible for will be distributed among the remaining nodes.
When the downed node comes back online, the system will "fill it in" on what happened while it was down. More specifically, when a node joins the cluster, it sends and receives the SHA-1 hashes of the system's cluster, users, topics, subscriptions, events, and deliveries files. If a hash that it has locally does not match the corresponding remote hash, the system tries to merge the remote file with its local file.
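To make the hash comparison concrete, here is a rough sketch in Python. It is not Miranda's actual code: the names `FILE_KINDS`, `digests`, `kinds_to_merge`, `merge`, and `sync` are made up for illustration, each file is modeled as a set of strings, and the merge is assumed to be a plain set union, which is almost certainly simpler than the real rules.

```python
import hashlib

# Hypothetical labels for the six kinds of state a joining node compares.
FILE_KINDS = ["cluster", "users", "topics", "subscriptions", "events", "deliveries"]


def sha1_of(records):
    """Hash the records in sorted order so equal contents give equal digests."""
    digest = hashlib.sha1()
    for record in sorted(records):
        digest.update(record.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def digests(state):
    """Return one SHA-1 digest per kind of file; a missing kind counts as empty."""
    return {kind: sha1_of(state.get(kind, set())) for kind in FILE_KINDS}


def kinds_to_merge(local_digests, remote_digests):
    """List the kinds whose local digest differs from the remote digest."""
    return [kind for kind in FILE_KINDS if local_digests[kind] != remote_digests[kind]]


def merge(local_records, remote_records):
    # Assumption: a merge is a set union. The real merge rules are not shown here.
    return local_records | remote_records


def sync(local_state, remote_state):
    """Merge only the files whose digests do not match."""
    for kind in kinds_to_merge(digests(local_state), digests(remote_state)):
        local_state[kind] = merge(
            local_state.get(kind, set()), remote_state.get(kind, set())
        )
    return local_state
```

The point of comparing hashes first is that only the small digests cross the network when nothing has changed; a file is transferred and merged only on a mismatch.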
When a node joins or leaves the cluster, the online nodes hold an election. During the election, the subscriptions are distributed to the various nodes of the cluster.
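As an illustration of what an election has to produce, the sketch below hands out subscriptions round-robin over the online nodes. The round-robin rule and the `distribute` function are my assumptions for the example; the actual election may assign subscriptions quite differently.

```python
def distribute(subscriptions, online_nodes):
    """Assign every subscription to exactly one online node, round-robin.

    Assumption: both inputs are sorted first, so every node that runs this
    with the same inputs arrives at the same assignment.
    """
    if not online_nodes:
        raise ValueError("no online nodes to host subscriptions")
    nodes = sorted(online_nodes)
    assignment = {node: [] for node in nodes}
    for index, subscription in enumerate(sorted(subscriptions)):
        assignment[nodes[index % len(nodes)]].append(subscription)
    return assignment


subs = ["s1", "s2", "s3", "s4", "s5", "s6"]
before = distribute(subs, ["a", "b", "c"])  # three nodes online
after = distribute(subs, ["a", "b"])        # node "c" has gone down
```

Rerunning `distribute` with the reduced node list is the "remaining nodes pick up the lost node's subscriptions" case: every subscription still has exactly one host afterwards.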
When an Event or Delivery takes place, it is shared among the members of the cluster. That way, any node can host any Subscription.
Changes to Users, Topics, and Subscriptions are also shared among the cluster members.