
Message Queuing vs. Event Streaming: Key Differences and Use Cases

Nowadays, applications rarely operate alone - they need to communicate, share data, and collaborate with other systems to deliver comprehensive and efficient services. This has led to the development of various communication paradigms over the years, each designed to address specific integration needs and use cases. The most common integration options are file transfer, shared database, connectors, APIs, and messaging.

Choosing the right communication paradigm is essential for modern software architectures. In this article, we will focus on messaging and its subtypes (message queues and event streaming).

Message queuing and event streaming each have their strengths: message queuing is ideal for reliable, ordered message delivery and complex routing, while event streaming excels in high-throughput, low-latency data processing. However, they can also be combined to suit your unique use case. Let’s explore the distinct characteristics, advantages, and potential use cases for each of them in more detail.

How Messaging/Events Integrate in Modern Software Architectures

Microservices architecture is a popular way to build large-scale, complex applications. It structures an application as a collection of loosely coupled, independently deployable services, each handling a specific business function. This modular approach allows for better scalability, flexibility, and maintainability.

A critical aspect of microservices is how they integrate with each other. Ideally, communication between internal microservices should be kept to a minimum for increased efficiency.
There are two communication paradigms: synchronous and asynchronous.

Relying on synchronous HTTP connections between microservices, as in the long request/response cycles shown in the image below, makes the microservices less independent and slows down performance. Synchronous communication between microservices forms a "chain" of requests to fulfill a client's request and is considered an anti-pattern: if one service in the chain has problems, the whole system's performance suffers.

A good practice is to adopt asynchronous communication (pictured below), which allows microservices to interact using asynchronous messages or HTTP polling, so that the client request is served right away.

Messaging

A Little Bit of History


Messaging is the process of sending and receiving data asynchronously between different parts of an application or between different services, usually via an intermediate component called a broker.

The concept of messaging dates back to the 1970s and 1980s with the advent of systems like IBM's Customer Information Control System (CICS) and MQSeries (later IBM MQ), which facilitated asynchronous communication between systems. In 1998, the Java Message Service (JMS) was introduced as part of the Java 2 Platform, Enterprise Edition (J2EE). JMS provided a standardized API for Java applications to create, send, receive, and read messages from message-oriented middleware systems. While JMS was beneficial for Java applications, it was not designed for interoperability with non-Java applications.

In the following years, protocols such as AMQP, MQTT, and STOMP were developed to provide more flexible and lightweight options, incorporating concepts similar to those in JMS. Brokers like ActiveMQ and RabbitMQ were created to implement these protocols. In 2011, LinkedIn developed Apache Kafka, an event streaming platform that has become one of the most well-known brokers in recent years.

Types of Messages

A message is a generic term for a packet of data sent from one component to another, and it's usually composed of two parts: a header and a payload. There are different types of messages, including:

  • command message - instructs another system or component to perform a specific action. It is imperative, meaning it directs the receiver to carry out a task, such as processing an order or updating a record.
  • document message - used to transfer structured data between systems. It encapsulates information, such as a user profile or transaction details, ensuring both the sender and receiver have the same data for processing or storage.
  • event message - informs other systems about an occurrence or change in state within the system. It is descriptive, providing details about an event, like a status update or a detected change, enabling systems to react accordingly.
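To make these categories more concrete, here is a small illustrative sketch (in Python) of how each message type might look. The header/payload structure and field names are assumptions chosen for illustration, not a standard format.

```python
# Illustrative only: hypothetical header/payload shapes for the three message
# types described above. Field names are assumptions, not a standard.

command_message = {
    "header": {"type": "command", "id": "cmd-1001", "timestamp": "2024-05-01T10:15:00Z"},
    "payload": {"action": "process_order", "order_id": 42},  # tells the receiver what to do
}

document_message = {
    "header": {"type": "document", "id": "doc-2001", "timestamp": "2024-05-01T10:15:02Z"},
    "payload": {"user_id": 7, "name": "Jane Doe", "email": "jane@example.com"},  # transfers data/state
}

event_message = {
    "header": {"type": "event", "id": "evt-3001", "timestamp": "2024-05-01T10:15:05Z"},
    "payload": {"event": "order_shipped", "order_id": 42},  # describes something that happened
}
```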

Message Channels

When one application needs to send data, it doesn't simply throw the data into the messaging system. Instead, it sends the data to a specific Message Channel. Similarly, an application that is waiting to receive data doesn't just randomly grab data from the messaging system. It retrieves data from a specific Message Channel.

The sending application (producer) doesn't necessarily need to know which particular application will ultimately receive the data. However, what is certain is that the application that receives the data (consumer) will find it relevant and useful.

Now let’s see which message channel patterns are most commonly used.

Channel Patterns

Point-To-Point Pattern - Queue

The point-to-point messaging pattern involves sending messages from a producer to a specific consumer via a queue, ensuring that each message is consumed by only one receiver.

Properties:

  • Each message is consumed by only one consumer.
  • Sender and receiver operate independently, allowing asynchronous communication.
  • Messages are typically processed in the order they are sent (FIFO).
  • Messages are stored in the queue until successfully processed, ensuring they are not lost if the receiver is unavailable.
  • Multiple consumers can read from the same queue to balance the load, with each message processed by only one consumer.
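As an illustration of the point-to-point pattern, here is a minimal sketch using RabbitMQ's Python client, pika. The host, queue name, and message body are assumptions, and in practice the producer and consumer would run as separate processes.

```python
# Minimal point-to-point sketch with RabbitMQ (pip install pika).
# Host, queue name, and message body are illustrative assumptions.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# Durable queue: messages survive a broker restart until consumed and acknowledged.
channel.queue_declare(queue="task_queue", durable=True)

# Producer side: publish directly to the queue via the default exchange.
channel.basic_publish(
    exchange="",
    routing_key="task_queue",
    body=b"process order 42",
    properties=pika.BasicProperties(delivery_mode=2),  # mark the message as persistent
)

# Consumer side: each message on this queue is delivered to exactly one consumer.
def handle(ch, method, properties, body):
    print("received:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only after successful processing

channel.basic_qos(prefetch_count=1)  # fair dispatch across competing consumers
channel.basic_consume(queue="task_queue", on_message_callback=handle)
channel.start_consuming()
```

Because multiple consumers can attach to the same queue, this setup also doubles as a simple work queue for load balancing.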

Publish-Subscribe Pattern - Topics


The publish-subscribe messaging pattern involves sending messages from a publisher to multiple subscribers through topics or channels, allowing messages to be received by all interested subscribers.

Properties:

  • Each message is delivered to all subscribers interested in the topic.
  • Publishers and subscribers operate independently, allowing asynchronous communication.
  • Messages are categorized into topics or channels, and subscribers receive messages based on their subscription to specific topics.
  • Easily scales to accommodate many publishers and subscribers.
  • Messages are broadcast to multiple consumers, enabling broad dissemination of information.
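For contrast, here is a minimal publish-subscribe sketch, again with pika, using a RabbitMQ fanout exchange. Exchange and queue names are assumptions, and each subscriber would normally run in its own process.

```python
# Minimal publish-subscribe sketch with RabbitMQ (pip install pika).
# A fanout exchange copies every message to all bound queues, so every
# subscriber receives its own copy. Names below are illustrative assumptions.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="order_events", exchange_type="fanout")

# Each subscriber declares its own exclusive queue and binds it to the exchange.
queue = channel.queue_declare(queue="", exclusive=True).method.queue
channel.queue_bind(exchange="order_events", queue=queue)

# Publisher side: the routing key is ignored by fanout exchanges.
channel.basic_publish(exchange="order_events", routing_key="", body=b"order 42 shipped")

# Subscriber side: every bound queue receives the broadcast message.
def handle(ch, method, properties, body):
    print("subscriber got:", body)

channel.basic_consume(queue=queue, on_message_callback=handle, auto_ack=True)
channel.start_consuming()
```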

Now that we have explored some general aspects of messaging, let's take a look at the most common messaging tools, along with an overview of their differences in terms of types, protocols, and the patterns they implement.

| Technology | Type | Point-To-Point (Queue) | Publish-Subscribe (Topics) | Protocols |
| --- | --- | --- | --- | --- |
| Apache ActiveMQ | Message Broker | ✓ | ✓ | AMQP, MQTT, OpenWire (native), STOMP, etc. |
| RabbitMQ | Message Broker | ✓ | ✓ | AMQP, MQTT, STOMP, etc. |
| Eclipse Mosquitto | MQTT Broker | ✗ | ✓ | MQTT |
| HiveMQ | MQTT Broker | ✗ | ✓ | MQTT |
| Apache Kafka | Event Streaming | ✗ (consumer groups offer queue-like consumption) | ✓ | custom binary protocol |
| Amazon SQS | Message Queue | ✓ | ✗ | HTTP/HTTPS, AWS SDK |
| Amazon SNS | Notification Service | ✗ | ✓ | HTTP/HTTPS, AWS SDK |
| Google Cloud Pub/Sub | Event Streaming | ✓ (one subscription with competing subscribers) | ✓ | HTTP/HTTPS, gRPC |
| Azure Service Bus | Message Broker | ✓ | ✓ | AMQP, HTTP/HTTPS |
| IBM MQ | Message Broker | ✓ | ✓ | MQTT, AMQP, IBM MQ protocol |
| Redis Pub/Sub | Pub/Sub System | ✗ | ✓ | Redis protocol |
| Azure Event Hubs | Event Streaming | ✗ | ✓ | AMQP, HTTP/HTTPS, Kafka protocol |

We can see that many of these technologies share similarities and common features. Some implement only queues, others only topics, and some support both. Many offer the flexibility of multiple protocols, while others rely on their own custom protocols.

Additionally, they can be categorized into several types:

  • Traditional Message Brokers
  • Event Streaming Platforms
  • MQTT Brokers
  • Notification Brokers


The market offers plenty of solutions, all with different features and strengths. This brings us to a highly debated question in the software engineering community: which one should I use?

The answer depends on various factors, such as specific use cases, performance requirements, scalability needs, and integration capabilities. By understanding the architectural distinctions and how they align with your project's demands, you can make an informed choice that best suits your needs.

A common dilemma arises when deciding whether to use a traditional message broker or an event streaming platform, so let's take a closer look at them.

Message Queuing vs Event Streaming


Let's analyze the main differences between Traditional Message Brokers and Event Streaming Platforms by comparing the internal architectures of two of the most widely used brokers in the industry: RabbitMQ (a traditional message broker) and Apache Kafka (an event streaming platform).

RabbitMQ - Traditional Message Broker

At its core, RabbitMQ uses exchanges to route messages from producers to queues, which then store the messages until consumers retrieve them. There are various types of exchanges, such as direct, topic, fanout, and headers, each defining different routing rules.

Producers send messages to exchanges, which then distribute the messages to queues based on the binding rules set between exchanges and queues. Consumers connect to queues to receive and process messages. This interaction happens over TCP connections, which can support multiple virtual connections known as channels.

RabbitMQ supports clustering, allowing multiple nodes to work together for high availability and scalability.
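To illustrate the exchange, binding, and queue flow described above, here is a hedged sketch of routing with a topic exchange using pika; the exchange, queue names, and routing keys are assumptions for illustration.

```python
# Sketch of exchange-based routing in RabbitMQ (pip install pika).
# A topic exchange routes a message to every queue whose binding pattern
# matches the routing key ("*" matches exactly one word, "#" matches any).
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="orders", exchange_type="topic")

# Binding rules: one queue for EU orders, one for all cancelled orders.
channel.queue_declare(queue="eu_orders")
channel.queue_bind(exchange="orders", queue="eu_orders", routing_key="order.eu.*")

channel.queue_declare(queue="cancelled_orders")
channel.queue_bind(exchange="orders", queue="cancelled_orders", routing_key="order.*.cancelled")

# "order.eu.cancelled" matches both binding patterns, so both queues get a copy.
channel.basic_publish(exchange="orders", routing_key="order.eu.cancelled", body=b"{...}")

connection.close()
```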

Apache Kafka - Event Streaming Platform

The architecture of a streaming platform, such as Kafka, is completely different. Kafka is built around the concept of a distributed, partitioned, and replicated log service. Producers send records (messages) to topics, which are split into partitions. Each partition is an ordered, immutable sequence of records, and new records are appended to the end of the partition. This partitioning allows Kafka to scale horizontally, as different partitions can be distributed across multiple Kafka brokers, enabling parallel processing and high throughput.

Kafka brokers are the servers that store the data and serve client requests. Each broker handles a subset of the partitions for each topic, and partitions are replicated across multiple brokers to ensure fault tolerance and high availability. If a broker fails, another broker that holds a replica of the partition can take over, ensuring that data is not lost and processing continues seamlessly.

Consumers in Kafka read records from partitions. Kafka employs a consumer group mechanism, where each consumer in a group reads from a subset of partitions, allowing for load balancing and coordinated consumption of data. This ensures that each record is processed by only one consumer within the group, but the work is distributed among all consumers in the group.
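Here is a minimal sketch of these concepts using the confluent-kafka Python client; the broker address, topic, key, and group id are assumptions for illustration.

```python
# Minimal Kafka sketch (pip install confluent-kafka).
# Broker address, topic, key, and group id are illustrative assumptions.
from confluent_kafka import Producer, Consumer

conf = {"bootstrap.servers": "localhost:9092"}

# Producer: records with the same key always land in the same partition,
# which preserves ordering for that key.
producer = Producer(conf)
producer.produce("orders", key="customer-42", value=b"order created")
producer.flush()

# Consumer: all consumers sharing a group.id split the topic's partitions
# among themselves, so each record is processed by one member of the group.
consumer = Consumer({**conf, "group.id": "billing-service", "auto.offset.reset": "earliest"})
consumer.subscribe(["orders"])

msg = consumer.poll(timeout=5.0)
if msg is not None and msg.error() is None:
    print(msg.partition(), msg.offset(), msg.value())

consumer.close()
```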

For a better understanding of Kafka, check out our Intro to Kafka article, which provides a great starting point.

Comparison of Apache Kafka and RabbitMQ

Let's take a closer look at the key differences and similarities between Apache Kafka and RabbitMQ:

| Feature | Apache Kafka | RabbitMQ |
| --- | --- | --- |
| Protocols | Custom binary protocol | AMQP, MQTT, STOMP, etc. |
| Message Ordering | Guaranteed within a partition | Guaranteed within a queue |
| Performance | Very high throughput, optimized for large-scale streaming | High throughput, optimized for individual message delivery |
| Delivery Semantics | At least once (default), at most once, exactly once (with configuration) | At least once |
| Dead Letter Queues | Supported via additional configuration or third-party tools (Kafka Connect) | Natively supported |
| Message Consumption (Push/Pull) | Pull-based consumption | Push-based consumption |
| Scalability | Horizontally scalable with partitions | Horizontally scalable with queues and clustering |
| Message Retention | Configurable retention periods, supports long-term storage | Messages are deleted once consumed unless configured otherwise |
| Persistence | Log-based storage, highly durable | Durable queues and persistent messages with configuration |
| Consumer Groups | Yes, with automatic load balancing across consumers in a group | Basic load balancing through competing consumers |
| Replication | Configurable replication factor per topic | Mirrored queues with manual configuration |
| Built-In Stream Processing | Yes (Kafka Streams) | No |
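To make the delivery-semantics and consumption rows more concrete, here is a hedged sketch of at-least-once consumption with confluent-kafka, where offsets are committed only after a record has been processed; the broker, topic, group id, and processing function are assumptions.

```python
# Sketch of at-least-once consumption in Kafka (pip install confluent-kafka).
# Offsets are committed only after processing, so a crash before the commit
# leads to reprocessing rather than data loss.
from confluent_kafka import Consumer

def process(value: bytes) -> None:
    # Hypothetical business logic; replace with real processing.
    print("processing", value)

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "payments",
    "enable.auto.commit": False,        # take control of when offsets are committed
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders"])

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None or msg.error():
            continue
        process(msg.value())
        consumer.commit(message=msg, asynchronous=False)  # commit only after success
finally:
    consumer.close()
```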

Use Cases for Message Brokers and Event Streaming

Traditional message brokers are a good fit for:

  • Real-time applications (point-to-point messaging) - Situations where messages need to be reliably sent from one point to another, such as in real-time chat applications or control systems in games.
  • Application/Services exchanging data with simple clients - Suitable for microservices architecture where services need to exchange data or commands. Examples include order processing systems or inventory management.
  • Message traffic that makes extensive use of routing rules - Scenarios requiring messages to be routed based on content or specific criteria, such as directing customer service requests to the right department or balancing load among servers.
  • Control over messages (e.g. message expiry or message delay) - Provides fine-grained control over message delivery, including setting TTL (time-to-live) for expiring messages, delaying message delivery for scheduled tasks, or prioritizing messages (see the sketch after this list).
  • Task queuing - Distributing tasks among worker nodes in a reliable manner, ensuring tasks are processed even if workers go down. Common in background job processing, batch processing, and distributed task execution.
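As referenced in the message-control bullet above, here is a minimal sketch of per-message expiry (TTL) with RabbitMQ and pika; the queue name and message body are assumptions.

```python
# Per-message TTL with RabbitMQ (pip install pika). The expiration property is
# a string in milliseconds; if the message is not consumed within that window,
# the broker discards it (or dead-letters it if the queue is configured to).
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="price_updates")

channel.basic_publish(
    exchange="",
    routing_key="price_updates",
    body=b"EURUSD=1.0842",
    properties=pika.BasicProperties(expiration="60000"),  # expire after 60 seconds
)
connection.close()
```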

Event streaming platforms are well-suited for the following use cases:

  • Real-time data pipelines and streaming applications - High-throughput data streams, enabling real-time data processing and analytics. Suitable for financial market data processing, social media feed analysis, or IoT device data streams.
  • Big data offloading to data warehouse and data lakes - Large volumes of data ingestion from various sources into data storage solutions like Hadoop, Amazon S3, or Google BigQuery for further analysis and storage.
  • High-throughput, low-latency data processing tasks - Processing massive amounts of data quickly with minimal latency, supporting use cases like fraud detection, recommendation systems, and real-time analytics dashboards.
  • Log aggregation and monitoring - Aggregation of logs from various services and systems, facilitating easier monitoring, search, and analysis of application behavior and performance. Common in centralized logging solutions, application performance monitoring, and security event monitoring.
  • Event sourcing and state tracking - Capture of changes in application state as a sequence of events, facilitating the reconstruction of system state from event logs. Useful in domains like financial transactions, order fulfillment systems, and user activity tracking.
  • Website Activity Tracking - Track of user interactions and activities on websites in real-time, providing insights into user behavior, engagement metrics, and conversion rates. Supports use cases like clickstream analysis, user session tracking, and personalized marketing.

Conclusion

Communication paradigms can make or break your applications, so choosing the right one is crucial. While message queuing and event streaming each have their strengths, keep in mind that these technologies can also overlap or complement each other. For instance, RabbitMQ can handle immediate task distribution while Kafka manages real-time data analytics.

Understanding the specific needs of your application will guide you in selecting the right tool or combination of tools to build a robust and scalable system, and an architecture that meets your operational and performance requirements effectively.

 

About the Author 


Mircea Malaescu is an Azure-certified Data Engineer with over six years of experience in big data technologies, including Hadoop, Cassandra, NiFi, Kafka, and Spark. He has extensive experience in DevOps engineering practices and has handled projects in the exciting retail and automotive industries. Mircea is currently working on a streaming project based on Kafka and Azure.