
Build a Notification System: Part 1 System Design and Features

Notification Systems Design

I've been having a blast piecing ~30 enterprise cloud services together to support a Python notification system. Expand the diagram to full screen to see the services interacting, including the event publishers, blob storage, DNS, the common notification service, the outbound handler and the vendor adapters.

Requirements

  • Event-driven system: a lightweight notification of a condition or state change, sent when triggered
  • Publisher: a one-to-many relationship from the Publisher object to the Event object, to retain the source of the notification
  • Event: created when the Publisher service is triggered
  • EventType: a class that we'll create for our Events, with standard properties on all events, regardless of the
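The requirements above can be sketched as a small data model. This is only a minimal illustration, not the final schema: the field names (`version`, `payload`, `created_at`) and the `trigger` method are assumptions made for the example.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class EventType:
    """Describes a kind of event (the fields here are hypothetical)."""
    name: str
    version: int = 1


@dataclass
class Event:
    """Standard properties carried by every event, whatever its type."""
    event_type: EventType
    publisher_id: str  # retains the source of the notification
    payload: dict
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class Publisher:
    """A source service; one Publisher can emit many Events."""
    id: str
    name: str
    events: list[Event] = field(default_factory=list)

    def trigger(self, event_type: EventType, payload: dict) -> Event:
        """Create an Event when the publisher service is triggered."""
        event = Event(event_type=event_type, publisher_id=self.id, payload=payload)
        self.events.append(event)
        return event
```

In a Django app the same shape would more likely be expressed as models with a foreign key from Event to Publisher; plain dataclasses are used here only to keep the sketch self-contained.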

Picking your tools?

Deciding what tool to use is highly dependent on what the product needs and the tradeoffs you're willing to make. I'll be using Azure products in this example, but the same requirements will map to other cloud providers.

  • Do you want your data to be streaming?
  • Does the order of events matter?
  • Do you need a queue? Do you need to pause, rewind or restart?
  • Do you need to handle a large volume of data?
  • Do your events rely on a client session (id)?

Full System Design Diagram

I've been having fun playing around with https://azurediagrams.com/.

  • Web Content Delivery
  • Blob Data Lake
  • DNS: here I use Azure DNS, but systems are moving to Azure Front Door.
  • DDoS Protection
  • Common Notification Service on App Service
  • Common Outbound Handler on Azure Functions
  • Notification Adapters on Azure Functions
  • Redis cache and PostgreSQL + Backups
  • Event Publishers: App Service, Container Apps
  • Monitoring: Application Insights
  • Serverless functions
  • WebSockets
  • Load balancer
  • Notification vendors: SMS, email, chat, voice, mobile

Diagram edits: the diagram above incorrectly shows two sections labeled "Common Notification Service." The left of the two should be labeled "Common Outbound Handler," which uses Azure Functions. Further left of the "INGEST" section, in "PROCESS," the Azure Functions could be labeled "Notification Adapters," because this is where the business logic will live to transform the data to suit each notification vendor.
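To make the adapter idea concrete, here is a minimal sketch of how one notification could be reshaped per vendor. Everything in it is an assumption for illustration: the notification dict layout, the output field names and the 160-character SMS limit are hypothetical and would be replaced by whatever each real vendor API expects.

```python
from __future__ import annotations

from typing import Protocol


class NotificationAdapter(Protocol):
    """Anything that can reshape a notification for one vendor."""

    def transform(self, notification: dict) -> dict: ...


class SmsAdapter:
    MAX_LENGTH = 160  # assumed limit, for illustration only

    def transform(self, notification: dict) -> dict:
        # SMS has no subject line, so only the (truncated) message is sent.
        return {
            "to": notification["recipient"]["phone"],
            "body": notification["message"][: self.MAX_LENGTH],
        }


class EmailAdapter:
    def transform(self, notification: dict) -> dict:
        return {
            "to": notification["recipient"]["email"],
            "subject": notification["title"],
            "text": notification["message"],
        }


# Registry keyed by channel name (hypothetical channel names).
ADAPTERS: dict[str, NotificationAdapter] = {
    "sms": SmsAdapter(),
    "email": EmailAdapter(),
}


def adapt(channel: str, notification: dict) -> dict:
    """Return the vendor-shaped payload for the given channel."""
    try:
        adapter = ADAPTERS[channel]
    except KeyError:
        raise ValueError(f"No adapter registered for channel {channel!r}") from None
    return adapter.transform(notification)
```

Keeping each vendor's quirks inside its own adapter means the outbound handler can stay generic: it only needs to know the channel name.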

Triggering the Event Publisher: Client Servers

In this scenario, the App Service could be a normal Django app, while the FastAPI container could be a bulk notification service using different business logic to notify the correct accounts. What we have set up currently does not require real-time event processing. However, if we wanted to set up streaming data, one solution could be to call PostgreSQL's NOTIFY from a trigger, which sends a notification event, like a change feed, to any session listening on the channel specified in the database.
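A rough sketch of that NOTIFY approach might look like the following. The channel, table and function names are hypothetical, and the listener assumes the third-party `psycopg2` driver is installed; treat it as a starting point rather than the setup used in this system.

```python
import json
import select

CHANNEL = "notification_events"  # hypothetical channel name

# SQL to run once against the database. The table name "event" and the
# function/trigger names are assumptions for this example.
TRIGGER_SQL = f"""
CREATE OR REPLACE FUNCTION notify_event() RETURNS trigger AS $$
BEGIN
    -- Send the new row as JSON text to every session listening on the channel.
    PERFORM pg_notify('{CHANNEL}', row_to_json(NEW)::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER event_inserted
AFTER INSERT ON event
FOR EACH ROW EXECUTE FUNCTION notify_event();
"""


def parse_payload(payload: str) -> dict:
    """Decode the JSON text sent by pg_notify into a dict."""
    return json.loads(payload)


def listen(dsn: str, handle, timeout: float = 5.0) -> None:
    """Wait on the channel forever and pass each decoded row to handle()."""
    import psycopg2  # third-party driver, imported lazily
    from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT

    conn = psycopg2.connect(dsn)
    conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
    with conn.cursor() as cur:
        cur.execute(f"LISTEN {CHANNEL};")
    while True:
        # Wait until the connection's socket is readable or the timeout hits.
        if select.select([conn], [], [], timeout) == ([], [], []):
            continue
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            handle(parse_payload(note.payload))
```

Two things to keep in mind with this design: notifications are only delivered to sessions that are listening at the time, so it is not a durable queue, and the payload is size-limited (just under 8000 bytes in the default configuration), so large rows are better sent as an id that the listener looks up.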

The tables in the single PostgreSQL database are a bit vague, but this is a generic notification system. The "Business Logic" table is doing a fair bit of heavy lifting: this is where notification and audience management will live, along with the notification tracker.

Event Data Ingestion

Choosing the event ingestion service will depend on the user experience you're looking to achieve.

Event Bus versus Event Hub versus Event Grid versus Web PubSub
Service: Event Hub
  • Pros: Processes large volumes of events and data with low latency and high reliability. Provides a namespace as a management container, with a DNS-integrated network, similar to Kafka. Includes IP filtering, virtual network endpoints and Private Link, Blob storage and partitioning.
  • Cons: Forward-only "ingestion" stream, which means there are not a lot of "broker features."

Service: Event Grid
  • Pros: Use when you're looking for Event Hub, Service Bus, storage queues, connections, etc. Provides event subscriptions over HTTP with event ingestion, plus a response trigger and other "broker features." A complete routing service that can connect to any application you create.
  • Cons: Both Event Grid and Web PubSub describe themselves as "fully managed publish-subscribe" messaging services. Setting up monitoring for each part of your services will be essential for debugging. Finding where there are issues will come from your Log Analytics Workspace.

Service: Web PubSub
  • Pros: Simple, short-form publish-subscribe without many extra bells and whistles. These are WebSockets, reminiscent of Azure SignalR without the need for a SignalR client. It still aims to manage real-time communication with the app server. The app server remains HTTP only.
  • Cons: Simple publish, subscribe.

Failures in these events could be unintentionally or intentionally silent if a missing event object is not observed through either a bi-directional WebSocket or a one-directional webhook.
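To illustrate the publish-subscribe pattern and how an event can go missing without anyone noticing, here is a tiny in-memory broker. It is not how Event Grid or Web PubSub work internally; the class and its `undelivered` list are hypothetical, and only show why it helps to record events that no subscriber received.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy publish-subscribe broker for illustration only."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Handler]] = defaultdict(list)
        # Events published to a topic nobody listens to. Without this list
        # they would simply vanish, which is the "silent" failure.
        self.undelivered: list[tuple[str, dict]] = []

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver the event and return how many handlers received it."""
        handlers = self._subscribers.get(topic, [])
        if not handlers:
            self.undelivered.append((topic, event))
            return 0
        for handler in handlers:
            handler(event)
        return len(handlers)
```

A quick usage example:

```python
broker = InMemoryBroker()
broker.subscribe("approval", print)
broker.publish("approval", {"id": 1})  # delivered to one handler
broker.publish("billing", {"id": 2})   # no subscribers, recorded in broker.undelivered
```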

Monitoring

We have analytics and monitoring at the bottom of the diagram, which will give us user analytics, user actions and notification analytics. Writing the user stories and designing the notifications gets fun as well, as we define more of the business logic related to the domain. Owners will receive notifications of approvers' decisions, and owners and approvers will receive notifications on all changes. Subscribers will see notifications only at the start and end of the approval. The notifications themselves should aim for consistent wording, localization, verbose naming, variables, and brevity without sacrificing clarity.
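The audience rules above could be captured in a small lookup table. This is a sketch under assumptions: the event kind names are made up for the example, and it assumes that the start and end of an approval also count as changes that owners and approvers should hear about.

```python
from __future__ import annotations

from enum import Enum


class Role(Enum):
    OWNER = "owner"
    APPROVER = "approver"
    SUBSCRIBER = "subscriber"


class Kind(Enum):
    """Hypothetical event kinds in an approval workflow."""
    STARTED = "approval_started"
    CHANGED = "changed"
    DECIDED = "approver_decision"
    ENDED = "approval_ended"


# Which roles are notified for each kind of event.
RULES: dict[Kind, set[Role]] = {
    Kind.STARTED: {Role.OWNER, Role.APPROVER, Role.SUBSCRIBER},
    Kind.CHANGED: {Role.OWNER, Role.APPROVER},
    Kind.DECIDED: {Role.OWNER},
    Kind.ENDED: {Role.OWNER, Role.APPROVER, Role.SUBSCRIBER},
}


def recipients(kind: Kind, members: dict[str, Role]) -> list[str]:
    """Return the sorted user names that should be notified for this event."""
    allowed = RULES[kind]
    return sorted(user for user, role in members.items() if role in allowed)
```

Keeping the rules in data rather than in scattered `if` statements makes them easy to review with the product owner and to extend when a new role or event kind appears.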

Next up: Django Notification System & App Service, Azure Functions, Notification Vendors, and Monitoring.