The elements of a good information system

The purpose of an information system is threefold: 1) preserve data; 2) present data; 3) transform data.

I outline the seven properties I believe are essential in any good information system.

  1. Recoverability of data: the data should be preserved in a recoverable way that contemplates parts of the system failing. Only a highly unlikely event should result in a loss of a significant part of the data.

  2. Understandability of data flows: every data flow should be clear and easy to follow.

    • Data in transit: Each endpoint, whether REST or a queue, should show the inputs and outputs. Errors are also outputs.
      • It should be clear which endpoints, when hit, trigger calls to other endpoints. A flow is a list of endpoints hit in sequence.
      • The inputs and outputs should be shown as serializable data. No abstractions!
    • Data at rest: show the structure of the database or structured files.
    • This also goes for communication with third parties.
  3. Testability of the system: the system should have complete and almost fully automatic tests.

    • There should be a test suite that covers all the possible cases of the code.
    • Because the set of possible inputs is potentially infinite, completeness requires writing the tests in a white-box way, so that they match the order of execution of the code.
    • Everything should be automatic, except for tasks requiring manual authentication when integrating with third-party providers.
  4. Simplicity of implementation:

    • Minimize lines of code and files.
    • Minimize dependencies.
    • Minimize technologies.
  5. Observability of operations:

    • All the errors generated by the system should be available.
    • All the logs and metrics of the system should be queryable.
    • Any abnormal situation should be immediately reported to the responsible parties.
  6. Scalability:

    • The system should be able to be deployed automatically.
    • The system should be able to grow in data flow and data at rest.
    • The tradeoffs between consistency and availability should be minimized and defined explicitly.
  7. Performance:

    • Any operation that has no reason to be slow should be very fast.
    • Any operation that has reason to be slow should be reasonably fast.
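
To make property 2 more concrete, here is a minimal sketch of what "showing the inputs and outputs as serializable data" could look like. Everything in it is an assumption made for illustration: the `call` helper, the `flow_log` list, and the two endpoints (`POST /order` and `GET /user`) are hypothetical names, not part of any particular framework. Each endpoint call is recorded as plain data, errors are recorded as outputs like any other result, and the flow is simply the list of endpoints hit in sequence.

```python
import json

# Hypothetical in-memory record of every endpoint call, kept as plain data.
flow_log = []

def call(endpoint, handler, request):
    """Invoke a handler and record its input and output as serializable data."""
    entry = {"endpoint": endpoint, "input": request, "output": None}
    flow_log.append(entry)  # recorded on entry, so the log follows hit order
    entry["output"] = handler(request)
    json.dumps(entry)  # raises TypeError if anything is not plain data
    return entry["output"]

def get_user(request):
    # Hypothetical endpoint. An error is returned as an output, not hidden.
    if "id" not in request:
        return {"error": "missing id"}
    return {"id": request["id"], "name": "Ada"}

def create_order(request):
    # Hitting this endpoint triggers a call to GET /user,
    # so both calls show up in the flow.
    user_request = {"id": request["user_id"]} if "user_id" in request else {}
    user = call("GET /user", get_user, user_request)
    if "error" in user:
        return {"error": "unknown user"}
    return {"order_id": 1, "user_id": user["id"]}

def flow():
    """Return the flow: the list of endpoints hit, in sequence."""
    return [entry["endpoint"] for entry in flow_log]
```

Calling `call("POST /order", create_order, {"user_id": 7})` and then `flow()` yields the two endpoints in the order they were hit, and `json.dumps(flow_log)` gives a complete, abstraction-free view of the data in transit. A real system would likely persist this record rather than keep it in memory; the in-memory list is only there to keep the sketch short.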