Cloud Observability - Ideation and solution architecture

Key Information

The challenge is finished.

Challenge Overview

Challenge Objectives

  • Ideate and design a solution architecture document for a stack that can be used to build an observability service to monitor microservices running in containers across clusters.
  • The output of this document will be a detailed solution design for the cloud observability app.
  • You are free to use existing available tools like Grafana, Prometheus, etc. to help design the solution.

Project Background

The link above points to an example application running on a Kubernetes cluster. Kubernetes uses pods as its workload unit, and pods run the application containers.

Pods are ephemeral by nature, and because logs are tightly coupled to their containers, a developer can no longer view them once a pod is deleted, re-created, restarted, or moved from one node to another.
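One common way to decouple logs from a pod's lifetime is to write structured log lines to standard output so that a node-level collector can ship them elsewhere. The following is a minimal sketch of that idea, not a prescribed design. The logger name `demo-service` and the environment variables `POD_NAME` and `NODE_NAME` are assumptions made for illustration; they would have to be injected by the deployment.

```python
import io
import json
import logging
import os
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line tagged with pod identity."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "message": record.getMessage(),
            # Assumed env vars; "unknown" is used when they are not set.
            "pod": os.environ.get("POD_NAME", "unknown"),
            "node": os.environ.get("NODE_NAME", "unknown"),
        })


def build_logger(stream=sys.stdout) -> logging.Logger:
    """Return a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("demo-service")  # hypothetical service name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    build_logger().info("order %s created", 42)
```

Because each line carries the pod and node name, the log entry stays attributable after the pod that produced it is gone.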

The highly dynamic and distributed nature of cloud-native apps, spread across different layers (UI, API, Service, DB, and infrastructure), makes it difficult for developers to find the root cause of problems. It is often hard to trace a request that passes through multiple layers of microservices to fulfill a client’s needs. The most complex problems tend to appear at the intersections between layers (UI, API, Service, DB, etc.), where they go undetected.
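Tracing a request across layers usually relies on passing a shared trace identifier with every hop. The sketch below illustrates this with a header in the W3C Trace Context `traceparent` layout (version, 32-hex trace id, 16-hex span id, 2-hex flags). It is a simplified illustration only: the function names are hypothetical, and validation rules such as rejecting all-zero ids are omitted.

```python
import re
import secrets
from typing import Dict, Optional

# version "00", 32-hex trace id, 16-hex span id, 2-hex flags
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent() -> str:
    """Start a new trace with a fresh trace id and span id."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-01"


def child_traceparent(incoming: Optional[str]) -> str:
    """Keep the caller's trace id but mint a new span id for this hop."""
    match = TRACEPARENT_RE.match(incoming or "")
    if match is None:
        # Missing or malformed header: this hop becomes the trace root.
        return new_traceparent()
    trace_id, _parent_span, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


def outgoing_headers(incoming_headers: Dict[str, str]) -> Dict[str, str]:
    """Headers a layer would attach to its downstream call."""
    return {"traceparent": child_traceparent(incoming_headers.get("traceparent"))}


if __name__ == "__main__":
    ui = outgoing_headers({})    # UI starts the trace
    api = outgoing_headers(ui)   # API continues it
    svc = outgoing_headers(api)  # Service continues it
    for layer, headers in (("ui", ui), ("api", api), ("service", svc)):
        print(layer, headers["traceparent"])
```

Every layer reports the same trace id with a different span id, which is what lets a backend stitch the hops into one request path.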

Problem Statement

The client seeks a robust, pluggable solution that enables end-to-end Observability of the Cloud Native Application by providing visibility into the entire App landscape. As part of the solution, the following features are envisioned and expected to be part of the design document.

  • Have a uniform mechanism to collect and ingest the required telemetry data (metrics, events, logs, and traces) from sources across the landscape (from the frontend UI to the infrastructure layer).
  • Establish a connected context and traceability across different components of the landscape and specific transactions that spread across multiple components.
  • The collected telemetry data should be presented in a way that makes troubleshooting easier, using visualization techniques suitable for large, complex distributed systems.
  • Have a unified database that gathers all the telemetry data in one place and gives a connected view of the systems. The schema is expected to be scalable and flexible as the business grows.
  • Have configurable alerts and notification mechanisms along with default ones.
  • Apply intelligence to derive patterns, detect anomalies, and correlate logs to business context.
  • Automatically apply instrumentation where visibility is needed the most (Nice to have)
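
To make the "unified database" and "configurable alerts" items above more concrete, here is a minimal in-memory sketch, assuming every telemetry record carries a trace id. All names (`TelemetryRecord`, `TelemetryStore`, `AlertRule`, `latency_ms`) are hypothetical and chosen for illustration; a real design would back this with an actual datastore and alerting tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class TelemetryRecord:
    kind: str          # "metric", "event", "log", or "trace"
    source: str        # layer, e.g. "ui", "api", "service", "db", "infra"
    trace_id: str      # shared id that connects records of one request
    timestamp: float
    body: Dict[str, Any] = field(default_factory=dict)  # flexible payload


class TelemetryStore:
    """Single place for all telemetry kinds, indexed by trace id."""

    def __init__(self) -> None:
        self._by_trace: Dict[str, List[TelemetryRecord]] = defaultdict(list)

    def ingest(self, record: TelemetryRecord) -> None:
        self._by_trace[record.trace_id].append(record)

    def connected_view(self, trace_id: str) -> List[TelemetryRecord]:
        """All records for one request, across layers, in time order."""
        return sorted(self._by_trace.get(trace_id, []), key=lambda r: r.timestamp)


@dataclass
class AlertRule:
    """Configurable threshold rule evaluated against metric records."""
    name: str
    metric: str
    threshold: float

    def fires(self, record: TelemetryRecord) -> bool:
        if record.kind != "metric":
            return False
        value = record.body.get(self.metric)
        return value is not None and value > self.threshold


if __name__ == "__main__":
    store = TelemetryStore()
    store.ingest(TelemetryRecord("log", "api", "t1", 2.0, {"message": "timeout"}))
    store.ingest(TelemetryRecord("metric", "db", "t1", 1.0, {"latency_ms": 950.0}))
    rule = AlertRule("slow-db", "latency_ms", 500.0)
    for rec in store.connected_view("t1"):
        print(rec.source, rec.kind, "ALERT" if rule.fires(rec) else "")
```

Keeping the payload in a free-form `body` while fixing only a few common fields is one way to keep the schema flexible as new sources are added.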

Auto-instrumentation allows users to monitor the applications without the need to modify the code base, and immediately start gathering observability data.
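
As a rough illustration of instrumenting code without editing it, the sketch below wraps the public methods of an existing class at runtime and records call timings. This is only one possible technique (runtime patching), shown under stated assumptions: `OrderService`, `auto_instrument`, and the in-memory `TIMINGS` list are hypothetical stand-ins for a real service and a real telemetry exporter.

```python
import functools
import inspect
import time
from typing import Any, Callable, Dict, List

TIMINGS: List[Dict[str, Any]] = []  # stand-in for a telemetry exporter


def instrument(func: Callable) -> Callable:
    """Wrap func so that every call records its name and duration."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            TIMINGS.append({
                "name": func.__name__,
                "seconds": time.perf_counter() - start,
            })
    return wrapper


def auto_instrument(cls: type) -> None:
    """Patch every public plain function defined on cls, in place."""
    for name, attr in list(vars(cls).items()):
        if inspect.isfunction(attr) and not name.startswith("_"):
            setattr(cls, name, instrument(attr))


class OrderService:  # hypothetical service whose source is left untouched
    def place_order(self, item: str) -> str:
        return f"ordered {item}"


if __name__ == "__main__":
    auto_instrument(OrderService)
    OrderService().place_order("book")
    print(TIMINGS)
```

The service class itself contains no observability code; the instrumentation is applied from outside, which is the property the pluggable requirement asks for.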



You are required to ideate and design a solution architecture document that describes how this solution can be brought to life.

The ideal solution will be pluggable. That is, the microservices monitored by this observability stack should require little to no change.

The aim is a clean solution that connects existing tools into a solid observability/monitoring stack.

Services/layers to log:

  1. UI
  2. API
  3. Service
  4. DB
  5. Infrastructure
  6. Application logs
  7. Instance Boot Logs


  • Good working knowledge of Microservices and Container Orchestration technology
  • Programming work experience
  • Important: Please refer to the following articles for a clear understanding of the architecture you would propose:


  • T-shirt for everyone: every successful submitter in any Skill Builder Competition gets a Topcoder t-shirt.
  • First Time Submitter: 50 first-time submitters in any of the Skill Builder Competitions get $50 as a bonus.
  • Gig Workers Bonus: Gig Workers who have never competed in a Topcoder Challenge get $50 as a bonus on top of the mentioned prizes.
  • AlefEdge Skill Builder Bonus: If you successfully submit in any Topcoder Skill Builder Competition and also participate successfully in the AlefEdge Skill Builder Competition, the first 50 members will earn an additional $100 bonus.
  • Multiple Skill Builder Submissions: The more you participate, the more you earn. There are 9 Skill Builder Competitions, and we’ll double your September bonus prizes if you successfully submit to all of them.

Submission Deliverables

Please submit the zip file containing the following for the initial review:

  • A design document detailing the tools and processes by which the observability stack can be built.
  • An optional video walkthrough explaining the document. (Please share it through your Google Drive.)


Final Review:

Community Review Board


User Sign-Off


ID: 30201037