Sunday, April 30, 2017

To Divide or not to Divide?

I was thinking of making the part of Miranda that serves web pages separate from the rest of the system.

On the pro side this makes Miranda a bit more secure and reliable.  The web side can go down without bringing down the rest of the system.  If an attacker gains control of the web side the rest of the system is OK.

On the con side, Miranda will still have to host servlets, and they will probably be hosted by the same software as the web pages (Jetty).

Decisions, decisions.

I will divide Miranda into two parts because it offers more security and flexibility.  The Miranda side can use something that is optimized for servlets while the web side can use something optimized for HTML.

Saturday, April 29, 2017

Miranda & Logins

Miranda doesn't do passwords.

This is partially for security reasons (a system that deals in passwords may not be secure) but mostly because passwords are a pain.

Passwords are a pain because, to do them properly, you can't store them in cleartext (an attacker who gained access to the system could get them) and, even encrypted or hashed, they are still sensitive information.

Since Miranda already requires users to register a public key, it uses that.  When a user "logs in" they simply provide their user name.  The system determines whether they already have an active session.  If they do, the system returns it.  If they do not have an active session, the system creates one.

When the system creates a session, it uses a random, 8-byte integer to identify it, the session id. When the system hands the session id back to the user, it first encrypts the value with the user's public key.  When the user gets this encrypted value, they decrypt it with their private key.

The user supplies the session id with all their requests, so the system can validate their identity.

It is extremely unlikely that an attacker could guess a valid session id, and most operations could be modified to limit the number of failed attempts.  The exception is new events, which the system has to process as quickly as possible.

The alternative is to detect when a large number of failed uses of session ids has occurred. Something for my todo list!
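A minimal sketch of that round trip follows. It is not Miranda's actual code: the class and method names are hypothetical, and it assumes RSA keys with OAEP padding, since the post does not say which algorithm the user's key pair uses.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;

// Hypothetical helper: issues a random 8-byte session id and encrypts it
// with the user's registered public key.
final class SessionIdSketch {
    private static final SecureRandom RANDOM = new SecureRandom();
    // Assumption: RSA with OAEP padding; the post does not name an algorithm.
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-1AndMGF1Padding";

    // A Java long is exactly 8 bytes, matching the "8-byte integer" in the text.
    static long newSessionId() {
        return RANDOM.nextLong();
    }

    // Server side: encrypt the id so only the holder of the private key can read it.
    static byte[] encryptForUser(long sessionId, PublicKey userKey) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.ENCRYPT_MODE, userKey);
            byte[] plain = ByteBuffer.allocate(Long.BYTES).putLong(sessionId).array();
            return cipher.doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not encrypt session id", e);
        }
    }

    // Client side: recover the id with the private key.
    static long decryptAsUser(byte[] encrypted, PrivateKey userKey) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, userKey);
            return ByteBuffer.wrap(cipher.doFinal(encrypted)).getLong();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not decrypt session id", e);
        }
    }

    // Demo only: in the scheme described above the user registers their own key.
    static KeyPair newDemoKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("RSA is not available", e);
        }
    }
}
```

An attacker who only knows the user name gets back ciphertext they cannot decrypt, which is what lets the login step skip passwords.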

Tuesday, April 18, 2017

Introducing Miranda: Certificate Authorities

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop. At this point, I'm filling in the gaps so things may skip around a lot.

By default, you define a new certificate authority when you install Miranda.  The new certificate authority signs the certificates of all the Miranda nodes so that they can join the cluster.

When a node tries to join the cluster, it is asked to present a certificate. The cluster checks that the certificate is signed by the certificate authority before the node is allowed to join the cluster.  When clients contact the system with the web interface, this is also the certificate that is used by SSL/TLS.

The certificate authority itself can be signed by something like Verisign or it can be self-signed.  The default is to use a self-signed certificate.  This allows people to take Miranda for a "test drive" without requiring them to get a certificate first.


Sunday, April 16, 2017

Operations

Everything goes through the Miranda object.  This is because the Miranda object knows to tell the rest of the cluster about things like new Sessions and new Events. This also makes for a cluttered Miranda class since it has to know about the details of lots of operations.

The solution, as I see it, is to create a temporary subsystem for each operation: an Operation class. An object capable of receiving messages is required, hence a subsystem, and it knows about the details of the operation, thus the Miranda class is less cluttered.

The first of these operations is the login operation.  It is created when the Miranda object is asked to perform a login.  It looks up the user and creates a session for them.  If the user doesn't exist, then it signals this and terminates.
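As a rough illustration of the idea, here is a sketch of a login operation. Everything in it is hypothetical: the class name, the reply strings, and the in-memory set and map that stand in for the UserManager and SessionManager. A real Operation would exchange Messages with those subsystems rather than hold their data.

```java
import java.security.SecureRandom;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of a login Operation; names are illustrative only.
final class LoginOperationSketch {
    static final String USER_NOT_FOUND = "user not found";

    private final Set<String> knownUsers;                                  // stand-in for the UserManager
    private final Map<String, Long> sessions = new ConcurrentHashMap<>();  // stand-in for the SessionManager
    private final BlockingQueue<String> replies = new LinkedBlockingQueue<>();
    private final SecureRandom random = new SecureRandom();

    LoginOperationSketch(Set<String> knownUsers) {
        this.knownUsers = knownUsers;
    }

    // Runs one login: an unknown user is signalled and the operation ends;
    // a known user gets their active session or a newly created one.
    void start(String userName) {
        if (!knownUsers.contains(userName)) {
            replies.add(USER_NOT_FOUND);
            return;
        }
        long sessionId = sessions.computeIfAbsent(userName, name -> random.nextLong());
        replies.add("session " + sessionId);
    }

    // The queue the requester reads the outcome from.
    BlockingQueue<String> replies() {
        return replies;
    }
}
```

The point of the design is that the details of "look up, then create or reuse" live in the operation, so the Miranda class only has to create it and hand it the request.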

Saturday, April 15, 2017

Introducing Miranda: Frequently Asked Questions

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.  I thought I would pose and answer some miscellaneous questions.
  • Why didn't you develop Miranda at Pearson?
  • Does Miranda work with AWS?
  • What is in store for the future of Miranda?
  • Miranda really needs feature X!

Why Didn't you Develop Miranda at Pearson?

My managers were always talking about doing this in the next release.  When that rolled around they would talk about the one after that and so on.

Does Miranda Work with AWS?

Miranda was designed with AWS in mind so it should work with AWS.

What is in Store for the Future of Miranda?

That depends on whether it takes off as open source.  For right now, I am planning on adding some sort of rewriting syntax for subscriptions.

Miranda Really Needs Feature X!

If a feature is asked for enough times, I will add it to the system.  If you want something added right now, then you can fork Miranda on GitHub. If you want me to do it, then you can hire me as a consultant to move your feature to the front of the list.

Friday, April 14, 2017

Introducing Miranda: How it Works Revisited

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.  After going through a dry run, it struck me that I talked too much about the nuts and bolts of Miranda, and not enough about how it works generally.  Therefore, I am redoing the "How it works" section to cover these issues better.

Miranda works by making the web services behind it appear to be more reliable than they are.  It does this by sitting in front of the web service and accepting Events on its behalf.  Miranda itself is a distributed, fault-tolerant system and is very reliable.

Later, when the service is ready, Miranda delivers those events.  Thus the underlying service does not have to be functional all the time; Miranda will accept Events for it while it is down.  If an event causes the service to crash, Miranda can put the Event aside until the problem is fixed.

Miranda stores Events on a cluster of systems. If one node fails, the other nodes have a copy of the Event.  Nodes take responsibility for delivering Events during an "Auction".  If a node that is tasked with delivering events to a service goes down, the Subscription is Auctioned off again to another node.

Wednesday, April 12, 2017

Introducing Miranda: Miranda Asks for a Password at Startup

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.  At this point, I'm filling in gaps, so things might jump around a bit.

Passwords present a bit of a problem for the system.  Storing them in a file makes them insecure, but the system needs them to encrypt and decrypt files.  The solution Miranda uses is to ask the user for the password as part of its startup routine if it does not already have one.

This does require a human being to start the system, but it will hopefully be infrequent enough that it will not be too much of a pain.

Introducing Miranda: Miranda is Distributed

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.  At this point, I'm filling in gaps, so things might jump around a bit.

One of the design goals of Miranda was to make it able to cross availability zones.  This was due to a limitation of the previous system, Prospero.  Prospero had trouble crossing availability zones because the database it used, Mnesia, did not like latency.  Furthermore, Prospero used RabbitMQ a lot and every round trip meant network delays.

Miranda does not use Mnesia or RabbitMQ, so it does not have these problems.

If a hurricane takes down a data center, the remaining nodes will keep going.  Subscriptions that the lost node was responsible for will be distributed among the remaining nodes.

When the down node comes back online, the system will "fill it in" on what happened while it was down. More specifically, when a node joins the cluster, it sends and receives the SHA1s of the cluster, users, topics, subscriptions, events and deliveries of the system.  If the SHA1 that it has locally does not match a remote SHA1, then the system tries to merge the remote file with its local file.
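The comparison step can be sketched with the JDK's MessageDigest. The class and method names below are hypothetical, and the sketch only decides whether a merge is needed; the merge itself is not shown.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for the sync step: compare a local file's SHA-1
// with the SHA-1 a remote node reported for the same file.
final class SyncDigestSketch {
    // Returns the SHA-1 of the contents as lowercase hex.
    static String sha1Hex(byte[] contents) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(contents)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException("SHA-1 is not available", e);
        }
    }

    // True when the local copy differs from the remote one and a merge should be tried.
    static boolean needsMerge(byte[] localFile, String remoteSha1) {
        return !sha1Hex(localFile).equalsIgnoreCase(remoteSha1);
    }
}
```

Exchanging digests first means the nodes only ship whole files when something has actually changed.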

When a node joins or leaves the cluster the online nodes hold an election.  During the election, the subscriptions are distributed to the various nodes of the cluster.

When an Event or Delivery takes place it is shared among the members of the cluster. That way, any node can host any Subscription.

Changes to Users, Topics, and Subscriptions are also shared among the cluster members.

Tuesday, April 11, 2017

Introducing Miranda: How it Works

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop. At this point, I'm filling in the gaps, so things might jump around a bit.

Miranda works by

  • Sitting in front of a web service and accepting HTTP Events on its behalf.
  • It then delivers those Events to the web service when it is available.
  • The admin user defines other Users.
  • Users define Topics.
  • Other Users Subscribe to the Topics.
  • Users send HTTP Events to those Topics
  • Miranda delivers those Events as part of a subscription.

Introducing Miranda: Why Miranda Won't Crash

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.  At this point, I'm filling in gaps, so things might jump around a bit.

Miranda won't crash for the following reasons:

  • Miranda is distributed.  It can lose one availability region and keep going.  When a node does come up, it will receive all the information it missed.
  • Each subsystem runs in its own thread. If one thread crashes, the other threads can keep going.
  • Miranda uses panics instead of System.exit.  That way, a lone method cannot terminate the system.
  • Miranda catches Exception.  So an uncaught exception won't crash the system.
  • Miranda has lots of tests. So a bug is unlikely to cause trouble.

Monday, April 10, 2017

Introducing Miranda: Why You Want Miranda

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

I have gone over a lot of topics concerning Miranda including:

  • The motivation for creating it
    • We want 9 9s of reliability
    • But we don't want to pay for it
    • Prospero gives us between 5 and 6 9s of reliability but has problems
    • Miranda was created to address those problems
  • How it works
    • Admins create a local certificate authority
    • Admins create users
    • Users create Topics and Events
    • Users create Subscriptions
    • Miranda delivers Events to users
  • Why it won't crash
    • Miranda is distributed and fault-tolerant
    • Each subsystem runs in its own thread
    • Threads catch Exception
    • Miranda uses panics instead of System.exit
  • Why it's secure
    • Open source
    • Miranda uses SSL/TLS for all communication
    • Miranda encrypts all files
    • Miranda asks for a password at startup

The conclusion is to use Miranda when you're in the "we want 9 9s of reliability" situation.

Introducing Miranda: Why Miranda is Secure

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

At some point, people will need a reason to trust Miranda; this post gives that reason.

Miranda should be trusted because it is open source, because it uses HTTPS/TLS for communications, and because it encrypts its files.

Open source is more trustworthy than closed source because you can see the code that is performing the operations.  Users can build the system themselves and see that there is no malicious code.

Miranda uses SSL/TLS to communicate with other nodes.  That way, an attacker cannot see what is going between nodes.  Miranda also uses SSL/TLS when clients send new Events to the system, so attackers cannot see Events sent to the system.

Miranda encrypts all its files, so if an attacker gets a hold of one, it won't do them any good.

Finally, Miranda has the capability to ask for a passphrase at startup, so users don't have to store any secure information in files.

Sunday, April 9, 2017

Introducing Miranda: Prospero and Miranda Security

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

Prospero supported limited security.  Admins could connect to it, log in, create new users and perform other administrative operations with a web browser.  It used HTTP for everything, however, and all its communications were in plain text. It also used plain text to talk to RabbitMQ.

Messages (POSTs) sent to it had to be signed with a symmetric key.  Admins therefore had access to all the user keys, and the keys were stored in plain text in the database.

Miranda is more serious about security.  When the system is first installed a new certificate authority is created.  This CA is used to sign the certificates that the various nodes present when they join the cluster.

All users have a key pair and, to do anything, they must first create a session.  The session is a random, 8-byte integer that is encrypted with the user's public key when it is handed back to the user.

All communication going into Miranda is encrypted using SSL/TLS.  Communications coming out of Miranda depend on the subscription: it can be HTTP or HTTPS.

Introducing Miranda: Unit Tests

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

Miranda has many JUnit tests: 316 at last count.  This does not guarantee that there are no bugs, but it does make them less likely.

A typical Miranda test checks that a Message gets to the method expected and is processed as expected.

I have also been pleasantly surprised by how smoothly Mockito has worked.  With it, I can focus on one class to test at a time and mock everything else out.

Miranda testing has also been simplified by the use of BlockingQueues and Messages.  A class method typically does a few things, then sends out a Message in response to an event.  Thus the test has to check that it took those actions.  The tests tend to be simple things because of this.

Thursday, April 6, 2017

Introducing Miranda: Exceptions and Panics

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

In Java, it is generally a bad idea to catch unchecked exceptions.  This is because an unchecked exception usually means something Very Bad has happened, like running out of memory, and it's time for the program to terminate.

So why does Miranda catch unchecked exceptions?

First of all, Miranda only does this in a few places.  Most notably, Miranda catches unchecked exceptions in the main loop, where it is getting the next message and processing it.  Since Miranda isn't supposed to crash, this is Miranda's last line of defense against a runaway subsystem taking everything down.

Secondly, Miranda's usual response to an exception is to create a panic.  I got the term from my days with Unix, where the operating system would, when it got into a bad state, "panic" and shut down.  In Miranda, when a panic occurs, the system can decide to halt immediately, shut down, or try to keep going.

If Miranda tries to keep going, it keeps track of these "recoverable" panics, and stops if they become too numerous.
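Here is a small sketch of a main loop that catches unchecked exceptions and counts recoverable panics. The names, the two-value Response enum (the "halt immediately" option is left out), and the threshold are all assumptions for illustration, not Miranda's real classes.

```java
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a main loop that turns unchecked exceptions into
// panics and stops once recoverable panics become too numerous.
final class PanicSketch {
    enum Response { KEEP_GOING, SHUT_DOWN }

    private final int maxRecoverablePanics;
    private int recoverablePanics = 0;

    PanicSketch(int maxRecoverablePanics) {
        this.maxRecoverablePanics = maxRecoverablePanics;
    }

    // Records a panic and decides what the system should do next.
    synchronized Response panic(Throwable cause, boolean recoverable) {
        if (!recoverable) {
            return Response.SHUT_DOWN;
        }
        recoverablePanics++;
        return recoverablePanics > maxRecoverablePanics ? Response.SHUT_DOWN : Response.KEEP_GOING;
    }

    synchronized int recoverablePanics() {
        return recoverablePanics;
    }

    // Processes queued tasks until the queue is empty or a panic says to stop.
    // Returns how many tasks completed. The sketch uses poll() so it ends when
    // the queue drains; a long-running loop would block on take() instead.
    int runLoop(BlockingQueue<Runnable> queue) {
        int completed = 0;
        Runnable task;
        while ((task = queue.poll()) != null) {
            try {
                task.run();
                completed++;
            } catch (RuntimeException e) {
                // Last line of defense: one bad task must not take down the loop.
                if (panic(e, true) == Response.SHUT_DOWN) {
                    break;
                }
            }
        }
        return completed;
    }
}
```

Routing every failure through one panic method keeps the keep-going-or-stop policy in a single place instead of scattering System.exit calls.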

Introducing Miranda: Threads and Subsystems

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

Miranda is not supposed to crash, ever.  With a system whose downtime per year is measured in minutes, the entirety of the system needs to be looked at.  For this reason, all subsystems run in their own thread.  That way, if one thread crashes the rest can keep going.

Steps have to be taken to make sure that a stray unchecked exception doesn't take down the system, but it does make things more stable.

This represents a problem from a language standpoint, since languages like Java are not good at communicating between threads.  Things like synchronized methods make this easier, but still not ideal.

This is why Miranda uses BlockingQueues and Messages when subsystems need to communicate. Using this model, a subsystem simply takes the next Message from its queue and processes it.  If no Messages are available, then it waits.  Subsystems process each message before going on to the next, so there are no interrupts.
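The model can be sketched in a few lines. This is an illustration only: the class name is hypothetical, and plain Strings with a "stop" sentinel stand in for Miranda's Message classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical subsystem: one queue in, messages handled strictly one at a time.
final class SubsystemSketch implements Runnable {
    static final String STOP = "stop";   // assumption: a sentinel message ends the loop

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new CopyOnWriteArrayList<>();

    // Other subsystems call this from their own threads.
    void send(String message) {
        queue.add(message);
    }

    @Override
    public void run() {
        try {
            while (true) {
                String message = queue.take();   // waits if no message is available
                if (STOP.equals(message)) {
                    return;
                }
                processed.add(message);          // "processing" is just recording here
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();  // preserve the interrupt and exit
        }
    }

    // A snapshot of what has been handled so far, in order.
    List<String> processed() {
        return new ArrayList<>(processed);
    }
}
```

In use, each subsystem would be started with something like `new Thread(subsystem).start()`, and the only shared state between threads is the queue.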

All this makes Miranda reliable, but cumbersome.  The difficulty in developing for Miranda is eclipsed by the need for reliability.

Wednesday, April 5, 2017

Introducing Miranda: How it all Fits Together: Deliveries

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

When a new Event is created, the Subscriptions examine it to see if they are interested in it (actually, they just look and see if they are subscribed to the topic).  If a Subscription is interested, it sends the event onto a delivery thread with the subscriber's URL.  The delivery thread sends the event to the client, who responds with a 2xx to signal that they got the event.

Graphically:

Subscriber      Delivery        Subscription    Event           Delivery
                Thread                          Manager         Manager
|               |               |               |               |
|               |               New Event       |               |
|               |               |<--------------|               |
|               Deliver(Event)  |               |               |
|               |<--------------|               |               |
|POST           |               |               |               |
|<--------------|               |               |               |
(2xx)           |               |               |               |
--------------->|               |               |               |
|               (success!)      |               |               |
|               --------------->|               |               |
|               |               (success!)      |               |
|               |               --------------->|               |
|               |               |               New             |
|               |               |               Delivery        |
|               |               |               --------------->|
|               |               |               |               |

Introducing Miranda: How it all Fits Together: New Events

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

Putting it all together, when a client sends an Event (we'll use a POST for this example) to a Miranda Topic, the system records the new Event and starts a write.  At the same time, it "tells" the Cluster about the new Event.

Graphically (primitively) it looks like this:

Client          Event           Event           Writer          Cluster
                Listener        Manager
|               |               |               |               |
POST            |               |               |               |
--------------->|               |               |               |
|               new POST        |               |               |
|               --------------->|               |               |
|               |               write events file               |
|               |               --------------->|               |
|               |               new POST        |               |
|               |               ------------------------------->|
|               (UUID)          |               |               |
|               |<--------------|               |               |
(UUID)          |               |               |               |
|<--------------|               |               |               |
|               |               |               |               |

Tuesday, April 4, 2017

Introducing Miranda: Deliveries

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

A Delivery represents an Event which has been sent to a subscriber and accepted.

A Delivery consists of:

  • The Event the Delivery is associated with.
  • When the Delivery occurred.
  • An attempt id (UUID)

Deliveries are "batched" together into groups of 100 (default) to form a delivery file.
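As a sketch of the batching idea, the helper below splits a list into groups of at most 100. The class and method names are hypothetical, and how a batch is actually written out as a delivery file is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split items into batches, one batch per file.
final class BatchSketch {
    static final int DEFAULT_BATCH_SIZE = 100;

    // Returns consecutive groups of at most batchSize items, preserving order.
    static <T> List<List<T>> batch(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }
}
```

With the default size, 250 deliveries would end up in three files holding 100, 100 and 50 entries.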

Deliveries are created when the system is able to send an Event to a subscriber and the subscriber responds with a 2xx result.

All Deliveries go through the DeliveryManager, which is in charge of all Deliveries and delivery files.

Introducing Miranda: Events

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

Events are the POST/PUT/DELETE messages that are sent to topics.

Events have the following attributes:

  • The type of event (POST, PUT or DELETE).
  • When the Event occurred.
  • Who created the event.
  • The Event contents (for POST and PUT).
  • An id (a UUID).

Events are "batched" together into groups of 100 (default) to form an Event file.

An Event is created when a User performs a POST/PUT/DELETE to a Topic.

Events are managed by the EventManager. When a new event is created, it goes through the EventManager.


Monday, April 3, 2017

Introducing Miranda: Subscriptions

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

Subscriptions ensure that Events sent to a Topic also get sent to the subscriber.

They consist of the user that created them, a URL where the events should be sent, a liveliness URL, and some attributes.

Users who can create subscriptions are called subscribers.  They can use the web site or the API to create a subscription.

When an event is delivered to the URL associated with a subscription and Miranda gets back a 2xx result, it is called a Delivery.

A subscription has attributes that control its behavior such as:

  • What to do with failed Events.  Does the system keep trying to deliver the Event, or does it put the failed Event into a list of failed Events?
  • How does the system handle recording Deliveries?  Does it move on after a Delivery has occurred, or does it wait for the delivery to be written to persistent store?
  • How does the system deal with other nodes?  Does it wait for acknowledgement or for the other nodes to write the Delivery?

The owner of a subscription or the admin can change it through the web site or the API.

A subscription can be deleted by the owner or the admin.  All messages that have not been delivered are lost.

Elections determine which node handles deliveries for the subscription.

If the system cannot deliver an Event for a subscription because it did not receive a reply from the URL, it tries to contact the subscriber via the liveliness URL. If it fails to contact the liveliness URL, it keeps trying, waiting twice as long between attempts until it reaches a limit of 5 minutes (default).

It keeps trying the liveliness URL every 5 minutes until it gets a 2xx result, at which time it resumes delivery of Events.  Miranda keeps Events for one week (default) before discarding them.
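The retry schedule is a capped exponential backoff, which can be sketched as below. The class and method names are hypothetical, and the starting delay is left to the caller because the post does not give one.

```java
// Hypothetical helper: doubles the wait between liveliness checks up to a limit.
final class LivelinessBackoffSketch {
    static final long DEFAULT_LIMIT_MILLIS = 5 * 60 * 1000L;   // 5 minutes

    // Returns the delay to use after a failed attempt that waited currentDelayMillis.
    static long nextDelay(long currentDelayMillis, long limitMillis) {
        // Compare against half the limit so the doubling can never overflow
        // or exceed the limit.
        if (currentDelayMillis > limitMillis / 2) {
            return limitMillis;
        }
        return currentDelayMillis * 2;
    }
}
```

Assuming a first wait of one second, the delays would run 2 s, 4 s, 8 s and so on up to 256 s, and then stay at the 5 minute limit.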

Introducing Miranda: Topics

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

Topics are endpoints that users send POST/PUT/DELETEs to.  They each have the name that they are created with and a UUID to identify them.

Other users can subscribe to a Topic to get the messages sent to it.

Topics can have attributes that control how messages are managed including

  • Whether a response is sent when an Event is received or when it is written to persistent store.
  • Whether the topic waits for the other members of the cluster to receive/acknowledge/write an Event.

Topics are created by users with that ability from the website or via the API.

The owner of a topic can modify it.

To delete a topic the owner of the topic must request removal through the web site or the API.  The Topic must have no subscribers.

The admin user can delete a topic with subscribers.  In that case, the subscriptions are also removed.

Sunday, April 2, 2017

Introducing Miranda: Sessions

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

Before a person can do anything in Miranda they must login and create a new Session. The user supplies a user name and the system responds with an encrypted, random long value, a session. The session is encrypted with the user's public key.  The user must accompany all their requests with the (decrypted) session.

The SessionManager keeps track of all the sessions.  It also is responsible for telling the cluster about new Sessions.  A session lasts for one hour (default).  The SessionManager checks for expired Sessions every 5 minutes (default).  If a Session is used, its expiration time is adjusted to give it an hour from that point until it expires.  When the SessionManager expires a Session, it tells the cluster about it, so the Session will expire on all nodes.
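The sliding expiration can be sketched as follows. The class and method names are hypothetical, the current time is passed in explicitly to keep the sketch easy to test, and telling the cluster about expired sessions is left out.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of sliding session expiration.
final class SessionExpirySketch {
    static final long DEFAULT_LIFETIME_MILLIS = 60 * 60 * 1000L;   // one hour

    private final long lifetimeMillis;
    // session id -> time (in milliseconds) at which the session expires
    private final Map<Long, Long> expirations = new ConcurrentHashMap<>();

    SessionExpirySketch(long lifetimeMillis) {
        this.lifetimeMillis = lifetimeMillis;
    }

    void create(long sessionId, long nowMillis) {
        expirations.put(sessionId, nowMillis + lifetimeMillis);
    }

    // Called when a request uses the session. Returns false if the session is
    // unknown or already past its expiration; otherwise pushes the expiration
    // out to a full lifetime from now.
    boolean touch(long sessionId, long nowMillis) {
        Long expires = expirations.get(sessionId);
        if (expires == null || expires <= nowMillis) {
            return false;
        }
        expirations.put(sessionId, nowMillis + lifetimeMillis);
        return true;
    }

    // The periodic sweep: removes expired sessions and returns how many were removed.
    int expire(long nowMillis) {
        int before = expirations.size();
        expirations.values().removeIf(expires -> expires <= nowMillis);
        return before - expirations.size();
    }
}
```

A session created at time zero and used 30 minutes later would survive a sweep at 70 minutes and be removed by a sweep at 91 minutes.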

Introducing Miranda: the Cluster

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

The Cluster is the collection of all the Miranda nodes that the local system knows about.

It acts as a repeater: when the system wants to tell the other nodes about something, like a new user or a new topic, then a message is sent to the Cluster.

Elections, where ownership of the system's subscriptions is decided, are run by the Cluster.

When a new node connects to the system, the Cluster is notified.

Saturday, April 1, 2017

Introducing Miranda: the Network

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

The Network subsystem facilitates communication between the different nodes of the system.

It uses integers (handles) to identify the various connections.  When the local system wants to send some data to another node it sends the handle along with the message.  The network looks up the connection using the handle and sends the message.
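A handle table along these lines might look like the sketch below. The names are hypothetical, and a `Consumer<String>` stands in for whatever connection object the networking library provides.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical sketch: the rest of the system sees only integer handles,
// never the networking library's connection type.
final class HandleTableSketch {
    private final AtomicInteger nextHandle = new AtomicInteger(1);
    private final Map<Integer, Consumer<String>> connections = new ConcurrentHashMap<>();

    // Registers a connection and returns the handle that identifies it.
    int register(Consumer<String> connection) {
        int handle = nextHandle.getAndIncrement();
        connections.put(handle, connection);
        return handle;
    }

    // Looks up the connection by handle and sends the message.
    // Returns false if the handle is unknown (for example, already closed).
    boolean send(int handle, String message) {
        Consumer<String> connection = connections.get(handle);
        if (connection == null) {
            return false;
        }
        connection.accept(message);
        return true;
    }

    void close(int handle) {
        connections.remove(handle);
    }
}
```

Because callers hold only an int, swapping the library behind the table does not change any code outside the Network subsystem.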

Handles keep the rest of the system from knowing too many of the details of the networking library. They were created during the period when Miranda was switching from Netty to Mina.

The Network subsystem also covers the part of the system that listens for new nodes.  When a new node connects, the Network tells Miranda and the Cluster about it.

Introducing Miranda: the Major Subsystems

In preparation for a talk I'm giving at DOSUG I'm going to post my thoughts as they develop.

The major subsystems of Miranda include:
  • Miranda
  • Network
  • Cluster
  • SessionManager
  • UserManager
  • TopicManager
  • SubscriptionManager
  • EventManager
  • DeliveryManager

The Miranda subsystem serves multiple purposes.  It is the crossroads for a lot of messages and it maintains the state for the system.  When a new node joins the cluster, the Miranda subsystem keeps track of where we are in the process of syncing.

The Cluster represents the other Miranda nodes as a whole. When something happens locally that we want to tell the other nodes about, like the creation of a new session, a message is sent to the cluster.

The Network subsystem is used to communicate with the other Miranda nodes in the system. When a node wants to send another node some data, it sends a message to the Network.

The SessionManager is responsible for Sessions in the system.  A Session is created when a user logs in to Miranda.  When that happens the session manager also tells the Cluster about the new session.

Each of the remaining subsystems manages a set of objects.  The UserManager manages Users and so on.  Each of these subsystems also has a file associated with it.  The file holds the collection, and the associated manager monitors it for changes.

The UserManager is also consulted during the logon process.

The Events and Deliveries managers are different in that they are in charge of directories instead of single files.  Events and Deliveries are batched together into files that the Events and Deliveries managers are responsible for.