8 items tagged “kafka”
2024
Build your own SQS or Kafka with Postgres (via) Anthony Accomazzo works on Sequin, an open source "message stream" (similar to Kafka) written in Elixir and Go on top of PostgreSQL.
This detailed article describes how you can implement message queue patterns on PostgreSQL from scratch, including this neat example using a CTE, returning
and for update skip locked
to retrieve $1
messages from the messages
table and simultaneously mark them with not_visible_until
set to $2
in order to "lock" them for processing by a client:
with available_messages as (
select seq
from messages
where not_visible_until is null
or (not_visible_until <= now())
order by inserted_at
limit $1
for update skip locked
)
update messages m
set
not_visible_until = $2,
deliver_count = deliver_count + 1,
last_delivered_at = now(),
updated_at = now()
from available_messages am
where m.seq = am.seq
returning m.seq, m.data;
2021
RabbitMQ Streams Overview. New in RabbitMQ 3.9: streams are a persisted, replicated append-only log with non-destructive consuming semantics. Sounds like it fits the same hole as Kafka and Redis Streams, an extremely useful pattern.
2020
Get Started—Materialize. Materialize is a really interesting new database—“a streaming SQL materialized view engine”. It builds materialized views on top of streaming data sources (such as Kafka)—you define the view using a SQL query, then it figures out how to keep that view up-to-date automatically as new data streams in. It speaks the PostgreSQL protocol so you can talk to it using the psql tool or any PostgreSQL client library. The “get started” guide is particularly impressive: it uses a curl stream of the Wikipedia recent changes API, parsed using a regular expression. And it’s written in Rust, so installing it is as easy as downloading and executing a single binary (though I used Homebrew).
2018
Introduction to Redis Streams. Redis 5.0 is out, introducing the first new Redis data type in several years: streams, a Kafka-like mechanism for implementing a replayable event stream that can be read by many different subscribers.
Mozilla Telemetry: In-depth Data Pipeline (via) Detailed behind-the-scenes look at an extremely sophisticated big data telemetry processing system built using open source tools. Some of this is unsurprising (S3 for storage, Spark and Kafka for streams) but the details are fascinating. They use a custom nginx module for the ingestion endpoint and have a “tee” server written in Lua and OpenResty which lets them route some traffic to alternative backend.
Notes on Kafka in Python. Useful review by Matthew Rocklin of the three main open source Python Kafka client libraries as of October 2017.
2017
Redis streams aren’t exciting for their innovativeness, but rather than they bring building a unified log architecture within reach of a small and/or inexpensive app. Kafka is infamously difficult to configure and get running, and is expensive to operate once you do. [...] Redis on the other hand is probably already in your stack.
Streams: a new general purpose data structure in Redis. Exciting new Redis feature inspired by Kafka: redis streams, which allow you to construct an efficient, in-memory list of messages (similar to a Kafka log) which clients can read sections of or block against and await real-time delivery of new messages. As expected from Salvatore the API design is clean, obvious and covers a wide range of exciting use-cases. Planned for release with Redis 4 by the end of the year!