LTSLLC Blog: February 2017

Tuesday, February 28, 2017

New Versions of SSLTest on GitHub

I got the netty and non-netty versions of SSLTest up on GitHub at

https://github.com/ClarkHobbie/ssltest

(netty version)

https://github.com/ClarkHobbie/ssltest2

(non-netty version)

All hail netty!

Excelsior!

I have finally gotten TLS to work.

With the example up on GitHub that you can get from:

https://github.com/ClarkHobbie/ssltest2

My SSL/TLS test works!

Admittedly, this is without netty but still. If I can't get netty to work then at least I have that as a backup.

I will spend the rest of the day adapting my example to work with netty.

Monday, February 27, 2017

Wasting Away Again in TLSville

I spent (wasted) the day trying to get TLS working.

For the record, here are the commands for creating the keys:

openssl req -x509 -newkey rsa:2048 -keyout ca-key.pem.txt -out ca-certificate.pem.txt -days 365 -nodes
keytool -import -keystore truststore -file ca-certificate.pem.txt -alias ca -storepass whatever
keytool -keystore serverkeystore -genkey -alias server -keyalg rsa -storepass whatever
keytool -keystore serverkeystore -storepass whatever -certreq -alias server -file server.csr
openssl x509 -req -CA ca-certificate.pem.txt -CAkey ca-key.pem.txt -in server.csr -out server.cer -days 365 -CAcreateserial
keytool -import -keystore serverkeystore -storepass whatever -file ca-certificate.pem.txt -alias ca
keytool -import -keystore serverkeystore -storepass whatever -file server.cer -alias server
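
For reference, here is a minimal sketch of how a plain-JDK program might load the two stores created above into an SSLContext. The file names and the password come from the commands; the class and method names are made up for illustration and are not taken from ssltest or ssltest2. It also assumes the server key's password is the same as the store password.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical helper for illustration; not taken from ssltest or ssltest2.
class TlsContextSketch {

    // Loads a Java keystore from a file. A null file name yields an empty store.
    static KeyStore loadStore(String fileName, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance("JKS");
            if (fileName == null) {
                store.load(null, password);
            } else {
                try (InputStream in = new FileInputStream(fileName)) {
                    store.load(in, password);
                }
            }
            return store;
        } catch (GeneralSecurityException | IOException e) {
            throw new IllegalStateException("could not load " + fileName, e);
        }
    }

    // Builds a TLS context that presents keys from keyStore and trusts
    // the certificates in trustStore (here, the local CA certificate).
    static SSLContext build(KeyStore keyStore, KeyStore trustStore, char[] keyPassword) {
        try {
            KeyManagerFactory kmf =
                    KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
            kmf.init(keyStore, keyPassword);

            TrustManagerFactory tmf =
                    TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
            tmf.init(trustStore);

            SSLContext context = SSLContext.getInstance("TLS");
            context.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);
            return context;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("could not build the SSL context", e);
        }
    }

    public static void main(String[] args) {
        // Assumes the key password equals the store password used above.
        char[] password = "whatever".toCharArray();
        SSLContext context = build(
                loadStore("serverkeystore", password),
                loadStore("truststore", password),
                password);
        System.out.println("Created context for " + context.getProtocol());
    }
}
```

A client would do the same thing with only the truststore, since it just needs to recognize certificates signed by the local CA.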

I have developed a simpler program that doesn't use netty. For the interested, I have put it up on GitHub at

https://github.com/ClarkHobbie/ssltest2

SSL/TLS can have an overpowering lure, causing me to waste time trying to fix it; hence the wasted day. My posting to Stack Overflow has gotten neither votes nor help, leading me to believe that if anything is going to happen with this problem, I will have to do it myself.

All hail netty!

Sunday, February 26, 2017

Testing...Again

One question that I came across while writing tests for Miranda is whether I should repeat myself. This is exemplified in the file TestFileWatcherService.java. The problem is that the methods testCheckFiles, testFireChanged and testWatch are all the same.

The thing is that testCheckFiles also tests the fireChanged and watch methods, so these additional tests are redundant. The question then becomes whether to repeat tests that do the same thing, or just do them once. At first, I replicated code, but now I'm not so sure.

Going forward, I will collapse tests that do the same thing into one, since I have found that less code is a Very Good Thing.

TLS still doesn't work (see my post on Stack Overflow).

All hail netty!

Saturday, February 25, 2017

To Mock or not to Mock

As part of the whole testing process, I find myself in situations where I could use a mock object or the real thing. I have already made some Big Mistakes by using other frameworks, and now I am considering Mockito.

For those who are not familiar with it, Mockito is a framework for setting up mock objects. It was made with my situation in mind. The problem with it is that you end up with blocks of code that have a lot to do with Mockito, but little to do with the system that you are testing.

On the other hand, I have ended up with large blocks of code that have a lot to do with testing Miranda, but little to do with Miranda itself. Thus the question remains: To use Mockito or not.

I think that, at this point, I will use Mockito and if it causes any problems, I will take it out.

TLS still doesn't work. See my question on Stack Overflow for updates.

All hail netty!

Friday, February 24, 2017

The Invalid Signature Problem

For some time now, I have been dealing with a problem where connections don't work when I try to use a local certificate authority with netty and transport layer security (TLS). The code for this problem is available on GitHub at

  https://github.com/ClarkHobbie/ssltest

When I try to connect, I get the following exception:

io.netty.handler.codec.DecoderException: javax.net.ssl.SSLKeyException: Invalid signature on ECDH server key exchange message.

The complete commands are:

  java -cp target\ssl-test-1.0-SNAPSHOT.jar;netty-all-4.1.6.Final.jar Server

and

  java -cp target\ssl-test-1.0-SNAPSHOT.jar;netty-all-4.1.6.Final.jar Client

I have modified the program to work with "remote CAs" like Google. Running the program this way doesn't work (Google isn't set up to send messages), but I don't get the invalid signature exception either.

The complete command to run against google is:

  java -cp target\ssl-test-1.0-SNAPSHOT.jar;netty-all-4.1.6.Final.jar Client remote google.com 443

Turning off encryption entirely works. It can be done with the following commands:

  java -cp target\ssl-test-1.0-SNAPSHOT.jar;netty-all-4.1.6.Final.jar Server nossl

and

  java -cp target\ssl-test-1.0-SNAPSHOT.jar;netty-all-4.1.6.Final.jar Client nossl

I have tried a variety of things, none of which work. If anyone knows of a solution, I'm all ears. Till then I've posted a question on Stack Overflow at:

  http://stackoverflow.com/questions/42445115/invalid-signature-on-ecdh-server-key-exchange-message

All hail netty!

The complete exception is:

io.netty.handler.codec.DecoderException: javax.net.ssl.SSLKeyException: Invalid signature on ECDH server key exchange message
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:351)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:373)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:359)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:651)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:574)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:488)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:450)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
at java.lang.Thread.run(Unknown Source)
Caused by: javax.net.ssl.SSLKeyException: Invalid signature on ECDH server key exchange message
at sun.security.ssl.Handshaker.checkThrown(Unknown Source)
at sun.security.ssl.SSLEngineImpl.checkTaskThrown(Unknown Source)
at sun.security.ssl.SSLEngineImpl.readNetRecord(Unknown Source)
at sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source)
at javax.net.ssl.SSLEngine.unwrap(Unknown Source)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1097)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:968)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:902)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411)
... 16 more
Caused by: javax.net.ssl.SSLKeyException: Invalid signature on ECDH server key exchange message
at sun.security.ssl.HandshakeMessage$ECDH_ServerKeyExchange.<init>(Unknown Source)
at sun.security.ssl.ClientHandshaker.processMessage(Unknown Source)
at sun.security.ssl.Handshaker.processLoop(Unknown Source)
at sun.security.ssl.Handshaker$1.run(Unknown Source)
at sun.security.ssl.Handshaker$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at sun.security.ssl.Handshaker$DelegatedTask.run(Unknown Source)
at io.netty.handler.ssl.SslHandler.runDelegatedTasks(SslHandler.java:1123)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1008)
... 18 more

Miranda and Encryption

Why does Miranda want to use a certificate authority? Why does Miranda use encryption at all?

Briefly, Miranda uses local certificate authorities to make it cheaper to use and easier to evaluate. Miranda uses encryption because events (messages) may contain things like personally identifiable information or other sensitive information.

The long answers are, well, longer.

First of all, if you don't have a requirement to encrypt things, then you can turn encryption off. Miranda was designed to be used on things like the Amazon cloud, however, where traffic may go across the internet, so with encryption off your events (messages) could be sent in the clear. If you are comfortable with that arrangement, then you can simply turn off encryption.

Miranda uses local certificate authorities because all nodes in the cluster are required to have certificates. Obtaining a certificate from an outside CA for every node in your cluster can quickly get expensive, not to mention inconvenient. Instead, you create your own certificate authority and use the local CA to sign all your node keys.

Miranda uses encryption because I found myself in situations where I wished that its predecessor, Prospero, did. In particular, one of the obstacles to using Prospero in AWS was its lack of support for encryption. Another problem was crossing availability zones. If we had a node on the West coast, and another on the East coast, then they would probably talk across the internet.

As far as what to use, I thought SSL/TLS, with their wide use, would be well supported, secure, cheap, and easy to use. While they are indeed well supported, secure and inexpensive, I have not found SSL/TLS to be at all easy to use. I have run across a problem that has forced me to do all my development work "in the clear." I refer to the dreaded "Invalid signature" problem that I posted about on Stack Overflow.

At any rate, that is why Miranda uses local certificate authorities and encryption in general.

All hail netty!

Wednesday, February 22, 2017

Subsystems and Threads

Miranda uses threads that run state machines and which communicate through BlockingQueues.

Each subsystem has its own thread and runs a Consumer, which basically sits around waiting for new messages from the thread's BlockingQueue. When the Consumer gets a new message, it hands it off to the Consumer's current state for processing.

There are two advantages to this arrangement. One, the module is inherently multi-threaded, and two, each state only has to worry about the messages relevant to it and not all the other messages that could occur.

When a state receives a message that it does not recognize, it asks the superclass to take a look at it and goes about normal processing. This allows you to factor out common behavior into super states (superclasses of states).
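
To make the arrangement concrete, here is a rough sketch of the pattern. All of the class names and the string messages are invented for illustration; Miranda's real classes differ. Each state handles the messages it cares about and passes everything else up to its superclass, and the consumer simply feeds messages from its queue to whatever the current state is.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: these names are assumptions, not Miranda's actual classes.
abstract class SketchState {
    // Common behavior for messages a concrete state does not recognize.
    SketchState processMessage(String message) {
        System.out.println(getClass().getSimpleName() + " ignoring " + message);
        return this; // stay in the same state
    }
}

class IdleState extends SketchState {
    @Override
    SketchState processMessage(String message) {
        if (message.equals("start")) {
            return new RunningState();
        }
        return super.processMessage(message); // let the super state look at it
    }
}

class RunningState extends SketchState {
    @Override
    SketchState processMessage(String message) {
        if (message.equals("stop")) {
            return new IdleState();
        }
        return super.processMessage(message);
    }
}

class SketchConsumer implements Runnable {
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private volatile SketchState currentState = new IdleState();

    void send(String message) {
        queue.add(message);
    }

    SketchState getCurrentState() {
        return currentState;
    }

    // The state decides what the next state is.
    private void handle(String message) {
        currentState = currentState.processMessage(message);
    }

    // Processes whatever is queued right now without blocking (handy in tests).
    void drain() {
        String message;
        while ((message = queue.poll()) != null) {
            handle(message);
        }
    }

    // What the subsystem's thread runs: block until a message arrives, handle it, repeat.
    @Override
    public void run() {
        try {
            while (true) {
                handle(queue.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In a running system each consumer would be started with something like new Thread(consumer).start(); the drain method exists only so the sketch can be exercised without timing issues.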

If I'm going to be honest about it, this is kind of the way Erlang does things, except in Erlang each object has its own thread and queue. Also, Erlang is a functional language, that is, you can't do things like "i++" in Erlang, whereas Java is an imperative language.

One upside to this approach is that, in theory, testing should be very easy. You just send a message, wait a little, and then see what the subsystem sent in response. I say "in theory" because in practice you instead discover very embarrassing bugs.

I think this was a Good Idea, instead of A Very Bad Idea, but we shall see.

TLS is still not working.

All hail netty!

Tuesday, February 21, 2017

Miranda Testing

Currently, the strategy is to create a test class for each Miranda class, and put the tests for that class there.

When I say "every Miranda class," I don't actually mean it. Classes that don't have interesting behavior (or put another way, boring classes) are exempt.

One of the perks of testing is getting a chance to refactor code. Classes which seemed like a good idea at the time can be "garbage collected" and removed.

TLS is still not working...

All hail netty!

Monday, February 20, 2017

Testing

Miranda is supposed to be an example of how I would like to go about designing and implementing a system.

This means testing.

Testing and I have a long and contentious history. The problem is that I'm lazy. I don't want to go to the extra effort of writing tests. In addition to writing the test itself, you have to set up preconditions and evaluate postconditions, making the effort on par with writing the code that the test is trying to exercise.

Bottom line: tests are as expensive as the code itself.

And if the underlying system changes, the tests have to change as well, making changes even more expensive. Thus certain people (read: me) want to delay writing tests until the system is "finished" (like that ever happens).

So I don't like writing tests.

But I'm getting to the point where one part of the system is stepping on another part when I try to test some aspect; so off I go to write tests.

Oh the joy.

Looking at the alternatives, I will use JUnit unless I find something better.

I can just taste the excitement.

All hail netty!

Sunday, February 19, 2017

Property Defaults

Miranda properties and their default values:

Miscellaneous properties.

These properties have "com.ltsllc.miranda" as their prefix.

Name	Default	Description
DelayBetweenRetries	10000	The amount of time, in milliseconds, that the system waits before retrying a connection to another node.
DeliveryDirectory	data/deliveries	The directory where delivery files are kept.
FileCheckPeriod	1000	The amount of time, in milliseconds, that the system waits between checks to see if any files have changed.
GarbageCollectionPeriod	3600000	The amount of time, in milliseconds, that the system waits before doing a collection of old users, topics and subscriptions.
Log4jFile	log4j.xml	The log4j configuration file.
MaxWriteFailures	5	The number of times that the system tries to write out a file before giving up when shutting down.
MessageDirectory	data/messages	The directory where the system keeps the message files.
MessageFileSize	100	The number of events (messages) in a single event file.
MessagePort	443	The TCP port that the system listens for clients to send events (messages) to.
Network	mina	The network library that the system uses.
PropertiesFile	mirnda.properties	The properties file for the system.
SubscriptionsFile	data/subscriptions.json	The file the system uses for subscriptions.
TopicsFile	data/topics.json	The file where the system keeps the topics.
UsersFile	data/users.json	The file where the system keeps the users.

Cluster Properties

These properties pertain to the operation of the system cluster. They all share the prefix: "com.ltsllc.miranda.cluster"

Name	Default	Description
File	data/cluster.json	The file that contains the nodes that make up the cluster
HealthCheckPeriod	86400000	The period of time, in milliseconds, that the system checks if a node is up.
Port	6789	The port that the system waits for new nodes to connect on.
Timeout	604800000	The amount of time, in milliseconds, that a node can be down before the system concludes that the node is dead.
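
As a rough illustration of how prefixed properties with defaults can be handled, here is a small sketch built on java.util.Properties, using the cluster properties from the table above. The class and method names are made up for this example and are not Miranda's actual property code.

```java
import java.util.Properties;

// Hypothetical sketch; not Miranda's actual property handling.
class ClusterPropertiesSketch {
    static final String PREFIX = "com.ltsllc.miranda.cluster.";

    // The defaults from the table above.
    static Properties defaults() {
        Properties defaults = new Properties();
        defaults.setProperty(PREFIX + "File", "data/cluster.json");
        defaults.setProperty(PREFIX + "HealthCheckPeriod", "86400000"); // 24 hours
        defaults.setProperty(PREFIX + "Port", "6789");
        defaults.setProperty(PREFIX + "Timeout", "604800000"); // 7 days
        return defaults;
    }

    // Values in overrides win; anything not overridden falls back to the defaults.
    static Properties withOverrides(Properties overrides) {
        Properties properties = new Properties(defaults());
        for (String name : overrides.stringPropertyNames()) {
            properties.setProperty(name, overrides.getProperty(name));
        }
        return properties;
    }

    // Looks up a numeric cluster property by its short name, for example "Port".
    static long getLong(Properties properties, String shortName) {
        return Long.parseLong(properties.getProperty(PREFIX + shortName));
    }
}
```

The same approach would work for the other groups by swapping in their prefix and defaults.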

Encryption Properties

These properties affect the way that the system uses encryption. Each uses "com.ltsllc.miranda.encryption" as a prefix to its name. Entries shown with a default of "none" have no default value.

Name	Default	Description
CertificateAlias	server	The alias of the certificate that the system presents to other nodes.
KeyStore	serverkeystore	The file that contains the server private key. The file must be in Java Keystore format.
KeyStoreAlias	server	The alias of the server private key.
KeyStorePassword	none	The password for the keystore.
Mode	localCA	The encryption mode used by the system.
Truststore	truststore	The file that contains the certificate used to sign the other certificates.
TruststoreAlias	ca	The alias of the certificate used to sign the other certificates.
TruststorePassword	none	The password for the truststore

HTTP Properties

These properties govern how the system uses HTTP and HTTPS. All properties have the prefix "com.ltsllc.miranda.http"

Name	Default	Description
Base	html	The directory that contains the system's HTML and other web-oriented files.
HttpPort	80	The TCP port that the system listens for new HTTP connections on.
Server	jetty	The web server to use.
SslPort	443	The TCP port that the system listens for new HTTPS connections on.

"My" Properties

These properties control what is sent with a join message. None of these properties have default values. All of these properties have the prefix "com.ltsllc.miranda.my"

Name	Default	Description
Description	none	A human-readable description of the node.
dns	none	The DNS name of the system.
ip	none	The IP address of the system.
Port	none	The TCP port that the system listens to for new nodes.

Panic Properties

These properties control how the system responds to panics. All these properties have the prefix "com.ltsllc.miranda.panic"

Name	Default	Description
Limit	3	How many recoverable panics the system will handle before it terminates.
Timeout	3600000	The amount of time, in milliseconds, that the system waits before decrementing the panic count.

Saturday, February 18, 2017

Follow-up

This is a follow-up to a recent post about how someone other than me is sending a response to an HTTP POST message that I am processing.

Some new facts have come to light, specifically:

A successfully added object results in a 201 code being returned, but curl is getting back 200.
If I immediately respond to a post, I can force it to use whatever code I want.
Even when I send garbage JSON, I still get back a 200.

I love netty! I do! I do!

All hail netty!

Someone is Responding (and it's not me)

With the incredibly simple approach that I am taking, when a new topic is created, it sends a message on to a class called NewTopicHandlerReadyState, which should send a reply when a new topic is added.

So far so good.

Now the interesting bit (and when I say interesting I mean annoying) is that this class's reply is not getting through.

In fact, I can't see where the reply that is getting through is coming from.

I have concluded that this is one of those interesting features that netty adds for free.

What was I thinking when I chose netty?

At this point, I feel like not fighting, and just embracing the insanity.

All hail netty!

Sunday, February 12, 2017

Why Use Miranda?

Miranda makes your systems more reliable.

In today's world, we are supposed to create systems that are obscenely reliable. Consider the following table:

# of Nines	Reliability	Downtime per Year
1	90%	36.5 days
2	99%	3.65 days
3	99.9%	9 hours
4	99.99%	53 minutes
5	99.999%	5 minutes
6	99.9999%	32 seconds
7	99.99999%	3 seconds
8	99.999999%	300 milliseconds
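
The downtime figures in the table follow from simple arithmetic: a year has 365 × 24 × 60 × 60 = 31,536,000 seconds (ignoring leap years), and n nines of reliability allows the system to be down for a fraction 10^-n of that. Here is a small sketch that does the calculation; the class and method names are just for illustration.

```java
// Illustrative helper for checking the table; the names are made up.
class NinesSketch {
    // 31,536,000 seconds in a 365-day year (leap years ignored).
    static final double SECONDS_PER_YEAR = 365 * 24 * 60 * 60;

    // Allowed downtime per year, in seconds, for the given number of nines.
    static double downtimeSeconds(int nines) {
        return SECONDS_PER_YEAR * Math.pow(10, -nines);
    }

    public static void main(String[] args) {
        for (int nines = 1; nines <= 8; nines++) {
            System.out.printf("%d nines: %.3f seconds per year%n",
                    nines, downtimeSeconds(nines));
        }
    }
}
```

Three nines, for example, works out to 31,536 seconds, or 8.76 hours, which the table rounds to 9 hours; five nines is about 315 seconds, or a little over 5 minutes.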

Can we provide 1 nine of reliability? Sure. How about 2 nines? Probably. How about 3? Maybe. 4? We have to have round-the-clock pager support with very motivated and knowledgeable people. 5? Probably not.

It becomes practically impossible for human beings to keep a system up somewhere between 4 and 5 nines and impossible for an automated system between 7 and 8 nines of reliability.

At around 3 nines you have to have 24/7 support with people constantly checking and able to resolve problems almost instantly.

At 7 nines and above, the reliability of your network connection becomes an issue: consider how many hours you have spent trying to get that working!

Miranda gives you what appears to be 5 nines of reliability, provided your system does only HTTP POST, PUT and DELETE.

Miranda does this by recording POST/PUT/DELETE messages and then playing them back later. If your system crashes, Miranda will save the messages until it is back up.

Assuming you are using Miranda as a clustered service, Miranda itself should be very (5 nines) reliable.

Basically, Miranda was designed for people who are suddenly asked to provide some outrageous level of reliability without the necessary resources.

Friday, February 10, 2017

Deletes and Synchronization

After some reflection, I have made the following decisions:

Deletes, in the case of users, topics and subscriptions, cause the status of the object to be changed. Later on, they are garbage collected.
In the case of nodes, Miranda shall track the time of last connection. If a long enough period of time goes by without a connect, the node is dropped.
In the case of nodes, users, topics and subscriptions, a synchronization involves merging the list of objects with the remote list.
Objects marked for deletion are not merged.
Messages and deliveries are never deleted, but may be garbage collected.

So when a new node connects, Miranda merges its lists of nodes, users, topics and subscriptions with the new node.
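
A minimal sketch of the merge rule might look like the following. The Item class is a hypothetical stand-in for a user, topic or subscription, and the handling of name collisions (the local copy wins) is my assumption for the sketch rather than a settled decision.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the merge rule; not Miranda's actual classes.
class MergeSketch {
    // Stand-in for a user, topic or subscription.
    static class Item {
        final String name;
        final boolean deleted; // true once the object has been marked for deletion

        Item(String name, boolean deleted) {
            this.name = name;
            this.deleted = deleted;
        }
    }

    // Adds remote items that the local list lacks.
    // Remote items marked for deletion are not merged.
    // Assumption: when both sides have an item with the same name, the local copy is kept.
    static List<Item> merge(List<Item> local, List<Item> remote) {
        Map<String, Item> merged = new LinkedHashMap<>();
        for (Item item : local) {
            merged.put(item.name, item);
        }
        for (Item item : remote) {
            if (!item.deleted) {
                merged.putIfAbsent(item.name, item);
            }
        }
        return new ArrayList<>(merged.values());
    }
}
```

Locally deleted items stay in the list with their changed status until the garbage collection pass removes them.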

The garbage collection period and the amount of time to wait before throwing away old nodes, users, topics and subscriptions are configurable.

Thursday, February 9, 2017

Cluster File Syncing

When a Miranda node connects to a cluster, it checks to see if its cluster file is up to date. If the remote file is more "up to date" than the local file, the new node downloads the remote file and updates itself. The new node gets the other cluster nodes and connects to them.

The problem is that the system depends on file modification times to decide which version is more recent. What if a new node's cluster file was edited more recently than the remote cluster file? This is not an unreasonable situation since the new node's files were probably modified so that the node could "see" at least one other node in the cluster.

What is needed is for the system to have some way to determine the "newness" of a file that doesn't depend on its modification time.

Wednesday, February 8, 2017

Basic Syncing

Basic syncing is now supported.

"Syncing" refers to comparing a remote version of an object to a local version. In this case, it refers to comparing the cluster files.

Miranda keeps versions by keeping a SHA-1 hash and a last-modified time for the various cluster files and comparing them. Two versions are considered to be the same if the SHA-1s match. If the SHA-1s differ, then the more recent (newer last-modified time) version is used.
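
Here is a small sketch of that comparison using the JDK's MessageDigest. The class and method names are invented for the example and are not Miranda's actual version code.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of a file version; not Miranda's actual class.
class VersionSketch {
    final String sha1;       // hex-encoded SHA-1 of the file contents
    final long lastModified; // modification time in milliseconds

    VersionSketch(byte[] content, long lastModified) {
        this.sha1 = sha1Hex(content);
        this.lastModified = lastModified;
    }

    static String sha1Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException(e);
        }
    }

    // Two versions are the same if their hashes match, whatever the timestamps say.
    boolean sameAs(VersionSketch other) {
        return sha1.equals(other.sha1);
    }

    // True when the contents differ and the other version is more recent.
    boolean shouldBeReplacedBy(VersionSketch other) {
        return !sameAs(other) && other.lastModified > lastModified;
    }
}
```

A node would build one of these for its local cluster file, receive the remote equivalent, and download the remote file only when shouldBeReplacedBy returns true.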

Eventually, users, topics, messages and deliveries will undergo syncing when two nodes connect.

Monday, February 6, 2017

To Do

This is a list of issues I need to address at some point:

Cluster can only get one version at a time.
GetVersion message can be sent by multiple nodes and "pass each other in the cloud."
The wrong port number is written to cluster.json
Sometimes two Json messages are sent in the same packet
NodeReadyState needs to handle the ConnectionClosedMessage
Null hostname in connectTo
Test com.ltsllc.miranda.cluster.Cluster.contains
Resolve the problem where the FileWatcherService wants access to MirandaProperties before the Writer has started.
SingleFileReadyReadyState's implementation of processGarbageCollectionMessage is wrong: it should check for "garbage" users, topics, and subscriptions.
Replace all uses of System.exit with panic.
Implement or get rid of SocketHttpServer.addServlets and NettyHttpServer.addServlets

Strategic Withdrawal

After two weeks of wrestling with Netty TLS I'm going to go onto something else. The sad fact is that I cannot get TLS to work, and it's holding up progress.

The idea is to work with TLS turned off until I can get it working. I will work on TLS every day; I will just not let it block other tasks.

For the curious, the problem is that the server does not seem to get messages. The client gets them, but they are gibberish.