Important design principles for protocol design and architecture
The purpose of these design principles is to provide a protocol and architecture that are flexible and robust with regard to changes in deployed and functional components, and simple to work with. The principles are guidelines rather than rules. If you need to break one of them, be sure you have considered it carefully and, preferably, discussed it with others.
- Try to handle a single concept in each operation.
Example: Do not try to infer failures on the pillar during the client's get operation. If the state of the pillars needs to be monitored by a client, make a separate operation for that.
- If your process needs a state machine to take decisions, try to have as few states as possible.
Example: In get, do you need the three cases: "Found", "Not found", "No reply"; or could "Not found" and "No reply" be handled as the same case?
- Try to let complicated logic be a part of a client, not of the protocol
Example: In integrity checking, do not make complicated messages communicating about expected state of the pillars. Rather make simple messages to extract state, and compare in the client.
- Simplicity of the pillars is more important than simplicity of the clients
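The integrity-checking example above can be sketched as follows. This is only an illustration under stated assumptions, not the actual protocol: it assumes each pillar answers a simple "what is your checksum for this file" message, and that the client has collected the replies into a map from pillar ID to checksum. The class name, the method name, and the majority-vote rule are all hypothetical. The point is that the comparison logic lives entirely in the client, while the pillars only report their own state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side helper; none of these names come from the protocol.
final class ChecksumComparison {

    /**
     * Returns the IDs of the pillars whose reported checksum differs from the
     * most common one, sorted by pillar ID. A pillar that did not reply is
     * simply absent from the map, so the client makes no assumption about it.
     */
    static List<String> findDisagreeingPillars(Map<String, String> checksumByPillar) {
        // Count how many pillars reported each checksum.
        Map<String, Integer> votes = new HashMap<>();
        for (String checksum : checksumByPillar.values()) {
            votes.merge(checksum, 1, Integer::sum);
        }

        // Pick the most common checksum (assumption: majority wins;
        // in a tie, which checksum is picked is unspecified).
        String majority = null;
        int best = 0;
        for (Map.Entry<String, Integer> entry : votes.entrySet()) {
            if (entry.getValue() > best) {
                best = entry.getValue();
                majority = entry.getKey();
            }
        }

        // Collect every pillar that disagrees with the majority.
        List<String> disagreeing = new ArrayList<>();
        for (Map.Entry<String, String> entry : new TreeMap<>(checksumByPillar).entrySet()) {
            if (!entry.getValue().equals(majority)) {
                disagreeing.add(entry.getKey());
            }
        }
        return disagreeing;
    }

    public static void main(String[] args) {
        Map<String, String> replies = new HashMap<>();
        replies.put("pillar-a", "abc123");
        replies.put("pillar-b", "abc123");
        replies.put("pillar-c", "ffff00");
        System.out.println(findDisagreeingPillars(replies));
    }
}
```

With this split, a change to the comparison rule only requires a new client; the pillars and the messages stay untouched.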
Modularity and Independence
- Never expect any other module to be present
Example: When sending a message on the message bus, make no assumptions that any other module exists to pick it up.
- Assume other modules may answer wrongly
Example: If you receive a message with unexpected content, it is okay not to be able to handle it, but it is not okay if this breaks the process.
- You only know what is in the message. Do not rely on the SLA or previous messages to tell you something about other players in the conversation.
Example: Use messages to infer which players are available. Even then, you cannot assume those players are available when you send the next messages.
- Do not expect a reply to your messages.
Example: Instead of doing "Send message, wait for reply" as one operation, do "Send message" as one operation, and set up a trigger that will take action on a reply. You may also set up a process to handle timeouts in case the trigger is not activated.
- No operation should be able to crash the entire client/pillar
Example: Every operation should make sure it catches faults, such as exceptions in Java.
- No operation should update local state in a way that could prevent normal operation
Example: In the integrity client, if a local cache of data is kept, care should be taken that corruption of this data does not prevent normal operation.
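Several of the principles above (do not expect a reply, handle timeouts separately, and never let one operation crash the whole component) can be combined in one small sketch. It is a minimal illustration, not the actual client code: the class name, the method names, and the use of a correlation ID to match replies to requests are assumptions made for this example. Sending the message is left out on purpose, since it is a separate operation that returns immediately.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical helper; all names here are illustrative assumptions.
final class ReplyTriggers {
    // Triggers waiting for a reply, keyed by an assumed correlation ID.
    private final Map<String, Consumer<String>> pending = new ConcurrentHashMap<>();

    // Daemon timer thread, so a forgotten timer never keeps the JVM alive.
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "reply-timeout");
                thread.setDaemon(true);
                return thread;
            });

    /** Registers a trigger and a timeout action, then returns without waiting. */
    void expectReply(String correlationId, Consumer<String> onReply,
                     Runnable onTimeout, long timeoutMillis) {
        pending.put(correlationId, onReply);
        timer.schedule(() -> {
            // Fire the timeout only if the trigger has not been activated.
            if (pending.remove(correlationId) != null) {
                runSafely(onTimeout);
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    /**
     * Called by the message listener for each incoming message.
     * Returns true if a trigger was waiting for it, false for unknown,
     * late, or duplicate replies, which are ignored.
     */
    boolean handleIncoming(String correlationId, String body) {
        Consumer<String> trigger = pending.remove(correlationId);
        if (trigger == null) {
            return false;
        }
        runSafely(() -> trigger.accept(body));
        return true;
    }

    /** Contains runtime exceptions so one failing operation cannot take down the component. */
    private static void runSafely(Runnable action) {
        try {
            action.run();
        } catch (RuntimeException e) {
            System.err.println("Operation failed, continuing: " + e);
        }
    }

    void shutdown() {
        timer.shutdownNow();
    }
}
```

A caller would send its message, call expectReply with the same correlation ID, and move on. If a reply with unexpected content makes the trigger throw, the exception is logged and the component keeps running; if no reply arrives, only the timeout action runs.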