Simon Willison’s Weblog

Subscribe
Atom feed for kafka

8 items tagged “kafka”

2024

Build your own SQS or Kafka with Postgres (via) Anthony Accomazzo works on Sequin, an open source "message stream" (similar to Kafka) written in Elixir and Go on top of PostgreSQL.

This detailed article describes how you can implement message queue patterns on PostgreSQL from scratch, including this neat example using a CTE, returning and for update skip locked to retrieve $1 messages from the messages table and simultaneously mark them with not_visible_until set to $2 in order to "lock" them for processing by a client:

with available_messages as (
  select seq
  from messages
  where not_visible_until is null
    or (not_visible_until <= now())
  order by inserted_at
  limit $1
  for update skip locked
)
update messages m
set
  not_visible_until = $2,
  deliver_count = deliver_count + 1,
  last_delivered_at = now(),
  updated_at = now()
from available_messages am
where m.seq = am.seq
returning m.seq, m.data;

# 31st July 2024, 5:34 pm / message-queues, postgresql, sql, sqs, kafka

2021

RabbitMQ Streams Overview. New in RabbitMQ 3.9: streams are a persisted, replicated append-only log with non-destructive consuming semantics. Sounds like it fits the same hole as Kafka and Redis Streams, an extremely useful pattern.

# 13th July 2021, 11:29 pm / message-queues, rabbitmq, redis, kafka

2020

Get Started—Materialize. Materialize is a really interesting new database—“a streaming SQL materialized view engine”. It builds materialized views on top of streaming data sources (such as Kafka)—you define the view using a SQL query, then it figures out how to keep that view up-to-date automatically as new data streams in. It speaks the PostgreSQL protocol so you can talk to it using the psql tool or any PostgreSQL client library. The “get started” guide is particularly impressive: it uses a curl stream of the Wikipedia recent changes API, parsed using a regular expression. And it’s written in Rust, so installing it is as easy as downloading and executing a single binary (though I used Homebrew).

# 1st June 2020, 10:11 pm / databases, postgresql, sql, kafka, rust

2018

Introduction to Redis Streams. Redis 5.0 is out, introducing the first new Redis data type in several years: streams, a Kafka-like mechanism for implementing a replayable event stream that can be read by many different subscribers.

# 18th October 2018, 8:35 am / redis, kafka

Mozilla Telemetry: In-depth Data Pipeline (via) Detailed behind-the-scenes look at an extremely sophisticated big data telemetry processing system built using open source tools. Some of this is unsurprising (S3 for storage, Spark and Kafka for streams) but the details are fascinating. They use a custom nginx module for the ingestion endpoint and have a “tee” server written in Lua and OpenResty which lets them route some traffic to alternative backend.

# 12th April 2018, 3:44 pm / analytics, lua, mozilla, nginx, big-data, kafka

Notes on Kafka in Python. Useful review by Matthew Rocklin of the three main open source Python Kafka client libraries as of October 2017.

# 13th January 2018, 7:40 pm / python, kafka

2017

Redis streams aren’t exciting for their innovativeness, but rather than they bring building a unified log architecture within reach of a small and/or inexpensive app. Kafka is infamously difficult to configure and get running, and is expensive to operate once you do. [...] Redis on the other hand is probably already in your stack.

Brandur Leach

# 8th November 2017, 4:23 pm / redis, kafka, brandur-leach

Streams: a new general purpose data structure in Redis. Exciting new Redis feature inspired by Kafka: redis streams, which allow you to construct an efficient, in-memory list of messages (similar to a Kafka log) which clients can read sections of or block against and await real-time delivery of new messages. As expected from Salvatore the API design is clean, obvious and covers a wide range of exciting use-cases. Planned for release with Redis 4 by the end of the year!

# 3rd October 2017, 3:25 pm / redis, salvatore-sanfilippo, kafka