/system-design-resources

Web resources on microservice architecture. Review of frameworks, libraries and tools

About

This is a list of web resources on technologies tied (directly or indirectly) to the architecture and design of microservices.

More than 100 resources are aggregated in total.

I chose mostly technologies built on Java/Scala, but this is a personal choice: microservices, as well as all related services, can be written in any language.

You can extend this list - see the Contribution section.

Overview

Microservice architecture is a vast topic, so I created a hierarchical table of contents to give a high-level grasp of the field as a whole. All resources are divided into groups, each containing several topics. Each topic in turn contains a few references.

General topics:

  • REST libraries/frameworks
  • Asynchronicity
  • Messaging
  • Security
  • CI/CD
  • Deployment/Orchestration

Microservice cloud's components:

  • Configuration
  • Service discovery
  • API gateway
  • Load balancing
  • Resilience

Observability:

  • Metrics
  • Distributed Tracing
  • Monitoring
  • Alerting
  • Log management

Testing:

  • Unit Testing
  • Integration Testing
  • Load testing
  • Security testing

Resources

General topics


REST libraries/frameworks

  • Apache CXF | used to build and develop services using frontend programming APIs, like JAX-WS and JAX-RS
  • Dropwizard | out-of-the-box support for configuration, application metrics, logging, operational tools, and more
  • Spark | a micro framework for creating web applications in Java with minimal effort, through a DSL
  • Play Framework | a high-velocity, highly productive web framework for Java and Scala, with a powerful templating engine and other features
  • Akka HTTP | a full server-side and client-side HTTP stack implemented on top of the actor model
  • Vert.x | a tool-kit for building sophisticated modern asynchronous applications and HTTP microservices

Asynchronicity

Messaging

  • Axon Framework | a lightweight, open-source Java framework to build scalable, extensible event-driven applications
  • Lagom | an opinionated, open source framework for building reactive microservice systems in Java/Scala
  • Akka | a toolkit for building highly concurrent, distributed, and resilient message-driven applications for Java and Scala
  • ZeroMQ | a brokerless messaging middleware (lets you connect sockets N-to-N with patterns like fan-out, pub-sub, task distribution, and request-reply)
  • Apache Kafka | a message broker: horizontally scalable, fault-tolerant, fast, and able to handle very large volumes of messages
  • RabbitMQ | a classic example of a message broker that speaks the AMQP protocol
  • NATS | a simple, high performance open source messaging system for cloud-native applications, IoT messaging, and microservice architectures. It implements a highly scalable and elegant publish-subscribe message distribution model
  • NSQ | an open-source realtime distributed messaging platform, designed to operate at scale and handle billions of messages per day. It also follows a broker-less model and as such has no single point of failure, supports high-availability and horizontal scalability.

Security

  • Spring Security | a powerful and highly customizable authentication and access-control framework. It is the de facto standard for securing Spring-based applications.
  • HashiCorp Vault | secures, stores, and tightly controls access to tokens, passwords, certificates, API keys, and other secrets in modern computing; see also https://learn.hashicorp.com/vault/#getting-started
  • Clair | an open source project for the static analysis of vulnerabilities in application containers
  • KMS (Google) | Key Management Service, a dedicated service for managing encryption keys
  • AWS Security products | Amazon offers a whole line of security-related products

Identity providers

  • Keycloak | an open source Identity and Access Management solution aimed at modern applications and services. It makes it easy to secure applications and services with little to no code.
  • WSO2 Identity Server | an extensible, open source IAM solution to federate and manage identities across both enterprise and cloud environments including APIs, mobile, and Internet of Things devices

CI/CD

  • Jenkins | one of the most widely deployed continuous integration (and continuous delivery) platforms
  • Bazel | an open-source build and test tool similar to Make, Maven, and Gradle. It uses a human-readable, high-level build language. Bazel supports projects in multiple languages and builds outputs for multiple platforms. Bazel supports large codebases across multiple repositories, and large numbers of users.
  • Buildbot | a job scheduling system which supports distributed, parallel execution of jobs across multiple platforms, flexible integration with different source control systems and extensive job status reporting

Deployment/Orchestration

  • Docker | a container platform that is exceptionally lightweight compared to traditional virtual machines: containers share the same operating system kernel, add little overhead, and need no extra hardware support. Docker can replace virtual machines in many cases
  • Apache Mesos | a cluster-management platform, which abstracts CPU, memory, storage, and other compute resources away from machines (physical or virtual), enabling fault-tolerant and elastic distributed systems to easily be built and run effectively
  • Titus | a framework on top of Apache Mesos. It is a container management platform that provides scalable and reliable container execution and cloud-native integration with Amazon AWS. Titus was built internally at Netflix and is used in production to power Netflix streaming, recommendation, and content systems.
  • Nomad | a highly available, distributed, data-center aware cluster and application scheduler designed to support the modern datacenter with support for long-running services, batch jobs, and more. It makes sense to use it with other HashiCorp products, because it has native integration with Consul and Vault for service discovery and secret management
  • Kubernetes (K8s) | is an open-source system for automating deployment, scaling, and management of containerized applications.
  • Service Meshes Pattern | service mesh is an infrastructure layer that handles service-to-service communication, freeing applications from being aware of the complex communication network. The mesh provides advanced capabilities, including encryption, authentication and authorization, routing, monitoring and tracing.
  • Linkerd | a lightweight service mesh for Kubernetes that adds observability, reliability, and security without requiring any code changes.
  • Istio | an open source service mesh that layers transparently onto existing distributed applications. It is also a platform, including APIs that let it integrate into any logging platform, or telemetry or policy system.
  • Consul Connect | provides service-to-service connection authorization and encryption using mutual Transport Layer Security (mTLS).
  • SuperGloo | an open-source project to manage and orchestrate service meshes at scale.

Microservice cloud's components:


Configuration

  • Spring Cloud Config | built on top of the Spring Platform, Spring Cloud Config is one of the most accessible configuration management options to start with. It provides both server-side and client-side support (the communication is based on the HTTP protocol) and is easy to integrate with, or even embed into, existing services.
  • Archaius | a configuration management library from Netflix

Service discovery

  • Atomix | provides capabilities for cluster management, communicating across nodes, asynchronous messaging, group membership, leader election, distributed concurrency control, partitioning, replication and state changes coordination in distributed systems.
  • Eureka | a REST-based service used primarily for service discovery (with an emphasis on AWS support). It is written purely in Java and includes server and client components
  • Apache Zookeeper | a centralized, highly available service for managing configuration and distributed coordination.
  • Etcd | a distributed, consistent and highly available key-value store, aimed at service discovery and configuration management.
  • Consul | a distributed, highly available, and data center aware solution for service discovery and configuration.

API gateway

  • Zuul/Zuul2 | serves as the front door for all requests coming to Netflix's streaming backends. It is a gateway service (also sometimes called an edge service) that provides dynamic routing, monitoring, resiliency, security, and more
  • Spring Cloud Gateway | a library to facilitate building your own API gateways leveraging Spring MVC and Spring WebFlux. The first generation of Spring Cloud Gateway was built on top of Zuul, but the next one has migrated to Spring’s Project Reactor
  • Microgateway | a developer-focused, extensible gateway framework written in Node.js for enforcing access to Microservices and APIs.
  • Gloo | an advanced Kubernetes-native API gateway, the next generation implementation of software from that category. Gloo is exceptional in its function level routing; its support for legacy apps, microservices and serverless; its discovery capabilities; its numerous features; and its tight integration with leading open-source projects.
  • Apache Camel | a collection of base components to build your own API gateway system

Load balancing

  • Nginx | open source software for web serving, reverse proxying, caching, load balancing of TCP/HTTP/UDP traffic (including HTTP/2 and gRPC), media streaming, and much more. What makes nginx an extremely popular choice is that its capabilities go well beyond load balancing: it is often used as a de facto web server. It can also serve as a static API gateway.
  • HAProxy | a free, very fast, reliable, high-performance TCP/HTTP (including HTTP/2 and gRPC) load balancer. For load balancing it can often replace nginx, and it suits most kinds of deployment environments and workloads. Like nginx, it can also serve as a static API gateway.
  • Traefik | a reverse proxy and load balancer that exposes metrics and access logs and bundles a web UI and REST(ful) web APIs
  • Synapse | a system for service discovery, developed and open sourced by Airbnb.
  • Envoy | a representative of the new generation of edge and service proxies. It supports advanced load balancing features (including retries, circuit breaking, rate limiting, request shadowing, zone local load balancing, etc) and has first class support for HTTP/2 and gRPC.
  • Ribbon | a client-side IPC library with built-in software load balancing. It supports TCP, UDP and HTTP protocols and integrates with Eureka.
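Client-side load balancing, the approach Ribbon takes in the list above, can be illustrated with a minimal round-robin sketch in plain Java. It is not Ribbon's API: the class name `RoundRobinBalancer`, the `choose()` method, and the server addresses are hypothetical, and a real balancer would also track server health and refresh its server list from a registry such as Eureka.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical client-side round-robin balancer (illustration only, not Ribbon's API). */
final class RoundRobinBalancer {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = List.copyOf(servers); // defensive, immutable copy
    }

    /** Returns the next server in rotation; safe to call from several threads. */
    String choose() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        // The addresses below are made up for the example.
        RoundRobinBalancer lb = new RoundRobinBalancer(List.of("10.0.0.1:8080", "10.0.0.2:8080"));
        for (int i = 0; i < 4; i++) {
            System.out.println(lb.choose());
        }
    }
}
```

Round-robin is the simplest policy; the proxies listed above (for example Envoy) add retries, circuit breaking, and zone-aware balancing on top of this basic idea.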

Resilience patterns

  • Health check pattern | lets infrastructure and orchestration layers probe the service
  • Spring-retry | retrying the request to the service in case of intermittent failures
  • failsafe | offering a range of retry and back-off policies
  • resilience4j | a lightweight fault tolerance library offering circuit breaker, rate limiter, retry, and bulkhead modules
  • Bulkhead | isolates resources so that a failure in one part of the application has minimal impact on the rest
  • Rate limiting | a technique to control the rate of requests
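The retry-with-back-off idea behind the libraries above can be sketched in a few lines of plain Java. This is a minimal illustration, not the API of Spring-retry, failsafe, or resilience4j: the `RetrySketch` class, the `retry` method, and its parameters are hypothetical names, and the sketch assumes every `RuntimeException` is worth retrying, which real code would usually narrow down.

```java
import java.util.function.Supplier;

/** Hypothetical retry helper with exponential back-off (illustration only). */
final class RetrySketch {

    /**
     * Calls the supplier up to maxAttempts times, doubling the delay after each failure.
     * Rethrows the last failure if every attempt fails.
     */
    static <T> T retry(Supplier<T> call, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMs = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt == maxAttempts) {
                    break; // no point sleeping after the final attempt
                }
                sleep(delayMs);
                delayMs *= 2; // exponential back-off
            }
        }
        throw last;
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while backing off", ie);
        }
    }
}
```

A call such as `RetrySketch.retry(() -> client.fetch(), 3, 100)` (where `client.fetch()` stands in for any remote call) would try up to three times, waiting 100 ms and then 200 ms between attempts. Production libraries typically add jitter, a maximum delay, and per-exception policies.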

Observability:


Metrics

  • Graphite | is an enterprise-ready monitoring tool that runs equally well on cheap hardware or Cloud infrastructure to store, retrieve, share, and visualize timeseries data.
  • OpenTSDB | is a distributed, scalable Time Series Database (TSDB) written on top of HBase addressing the need to store, index and serve metrics collected from computer systems (network gear, operating systems, applications) at a large scale, and make this data easily accessible and graphable.
  • TimescaleDB | yet another time series database optimized for fast ingest and complex queries.
  • Prometheus | is an open-source systems monitoring and alerting toolkit
  • Atlas | used to manage dimensional time series data for near real-time operational insight.

Distributed Tracing

  • OpenZipkin | a distributed tracing system. It helps gather timing data needed to troubleshoot latency problems in service architectures.
  • OpenTracing | is comprised of an API specification, frameworks and libraries that have implemented the specification, and documentation for the project.
  • Brave | is a distributed tracing instrumentation library. Brave typically intercepts production requests to gather timing data, correlate and propagate trace contexts.
  • Jaeger | a distributed tracing system
  • OpenCensus | a set of libraries for various languages that allow you to collect application metrics and distributed traces, then transfer the data to a backend of your choice in real time.
  • Haystack | an Expedia-backed open source distributed tracing project to facilitate detection and remediation of problems in microservices and websites.
  • Apache SkyWalking | an open source observability platform to collect, analyze, aggregate and visualize data from services and cloud native infrastructures

Monitoring

  • Grafana | a visualization tool with its own alert engine and alert rules.
  • Chronograf | a user interface for Kapacitor - a native data processing engine that can process both stream and batch data from InfluxDB (all from TICK Stack)

Alerting

  • Alertmanager | used to handle alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver

Log management

  • ELK | a stack of Elasticsearch, Logstash, and Kibana; Elasticsearch, at its core, is a distributed, RESTful search and analytics engine that stores data centrally
  • Graylog | a leading centralized log management solution built to open standards for capturing, storing, and enabling real-time analysis of terabytes of machine data, built on top of Elasticsearch and MongoDB.
  • GoAccess | an open source real-time web log analyzer and interactive viewer that runs in a terminal in
  • Apache Flume | a distributed, reliable, and available system for efficiently collecting, aggregating and moving large amounts of log data from many different sources to a centralized data store.

Testing:


Unit Testing

Integration Testing

  • Mockito | a Java library for functional and internal integration testing
  • Testcontainers | Docker containers for integration testing
  • Arquillian | a Java library for functional testing
  • Spring Boot Test | integration tests for Spring Boot apps
  • End-to-end testing | designed after the workflows performed by the users, from the beginning to the end
  • Selenium | a framework for UI testing and RPA

Load testing

  • Apache JMeter | a UI-based tool to create and manage quite sophisticated test plans, designed to load test functional behavior and measure performance.
  • Gatling | a highly capable load testing tool. It is designed for ease of use, maintainability and high performance
  • Apache Bench | a CLI tool for benchmarking HTTP-based services and applications.
  • Locust | an easy-to-use, distributed, scalable load testing framework written in Python

Security testing

  • SonarQube | an open source platform that performs automatic reviews with static code analysis to detect bugs, code smells and security vulnerabilities in 25+ programming languages including Java, C#, JavaScript, TypeScript, C/C++, COBOL and more
  • PMD | static analysis of vulnerabilities in source code
  • Find Security Bugs | code security audits for Java
  • Zed Attack Proxy | used to automatically find security vulnerabilities in web applications
  • XSStrike | a Cross Site Scripting detection suite equipped with four hand written parsers, an intelligent payload generator, a powerful fuzzing engine and a fast crawler
  • Archery | an open source vulnerability assessment and management tool that helps developers and pentesters perform scans and manage vulnerabilities. Archery uses popular open source tools to perform comprehensive scanning of web applications and networks.
  • Security Monkey | monitors your AWS and GCP accounts for policy changes and alerts on insecure configurations

Chaos Engineering