A fair, reproducible benchmark comparing RunMQ (RabbitMQ) and BullMQ (Redis) across six performance dimensions.
Transparency notice: This benchmark was created by the RunMQ maintainer. The entire benchmark framework — every line of code, the Dockerfile, the HTML report generator, and this README — was written by Claude (Anthropic). The RunMQ maintainer wrote zero lines of benchmark code and asked Claude to optimize BullMQ to the maximum.
While every effort was made to ensure fairness (equal Docker resources, both libraries tuned for max performance, addBulk() for BullMQ, documented methodology, multi-run averaging with stddev), the two systems have fundamentally different architectures that make perfect apples-to-apples comparison impossible. See Known Limitations for details.

We encourage the community to review the source, run the benchmark themselves, and report any bias.
Tested on MacBook Pro M4 Max, Docker Desktop, mean of 3 runs ± stddev.
| Scenario | RunMQ | BullMQ | Ratio |
|---|---|---|---|
| Publish Throughput | 67,206 ±1,322 msg/s | 53,702 ±400 msg/s | 1.3x faster |
| Consume Throughput | 22,120 ±278 msg/s | 8,513 ±76 msg/s | 2.6x faster |
| E2E Latency (mean) | 0.81 ±0.07 ms | 0.74 ±0.04 ms | BullMQ 1.1x lower |
| E2E Latency (p50) | 0.59 ±0.09 ms | 0.57 ±0.04 ms | BullMQ 1.0x lower |
| E2E Latency (p95) | 1.88 ±0.21 ms | 1.82 ±0.20 ms | BullMQ 1.0x lower |
| E2E Latency (p99) | 2.52 ±0.12 ms | 2.82 ±0.20 ms | RunMQ 1.1x lower |
| 1 Consumer | 9,072 ±35 msg/s | 558 ±6 msg/s | 16.3x faster |
| 2 Consumers | 14,378 ±211 msg/s | 1,101 ±1 msg/s | 13.1x faster |
| 4 Consumers | 21,149 ±234 msg/s | 2,105 ±14 msg/s | 10.0x faster |
| 8 Consumers | 24,552 ±933 msg/s | 3,880 ±35 msg/s | 6.3x faster |
| Publish 100B | 66,422 ±2,982 msg/s | 52,120 ±2,521 msg/s | 1.3x faster |
| Publish 1KB | 55,074 ±2,180 msg/s | 42,408 ±959 msg/s | 1.3x faster |
| Publish 10KB | 27,821 ±658 msg/s | 16,332 ±588 msg/s | 1.7x faster |
| Consume 100B | 21,364 ±823 msg/s | 8,131 ±253 msg/s | 2.6x faster |
| Consume 1KB | 19,417 ±309 msg/s | 7,764 ±51 msg/s | 2.5x faster |
| Consume 10KB | 10,982 ±153 msg/s | 5,311 ±170 msg/s | 2.1x faster |
| Reliability (basic) | 27,580 ±1,240 msg/s | 8,326 ±182 msg/s | 3.3x faster |
| Reliability (retries) | 27,139 ±848 msg/s | 8,282 ±150 msg/s | 3.3x faster |
RunMQ wins on throughput across all scenarios. Latency is competitive — BullMQ leads on mean, p50, and p95 by sub-millisecond margins, RunMQ leads on p99, with all values sub-3ms. RunMQ's usePublisherConfirms: true is on by default, so every publish() awaits a broker ack — matching BullMQ's confirmed-persistence semantics. See Known Limitations for important caveats about what these numbers represent.
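The Ratio column can be reproduced from the reported means. The following is a minimal sketch, assuming the ratio is the plain quotient of the two means rounded to one decimal place; the helper names are hypothetical and are not taken from the report generator.

```typescript
// Hypothetical helpers: the actual report generator may format ratios differently.

/** Throughput: higher is better, so the ratio is the RunMQ mean over the BullMQ mean. */
function throughputRatio(runmqMsgPerSec: number, bullmqMsgPerSec: number): string {
  // Note: this sketch always phrases the result from RunMQ's point of view.
  return `${(runmqMsgPerSec / bullmqMsgPerSec).toFixed(1)}x faster`;
}

/** Latency: lower is better, so the library with the smaller mean leads. */
function latencyRatio(runmqMs: number, bullmqMs: number): string {
  const leader = runmqMs <= bullmqMs ? "RunMQ" : "BullMQ";
  const ratio = Math.max(runmqMs, bullmqMs) / Math.min(runmqMs, bullmqMs);
  return `${leader} ${ratio.toFixed(1)}x lower`;
}
```

For example, `throughputRatio(67206, 53702)` gives "1.3x faster" and `latencyRatio(2.52, 2.82)` gives "RunMQ 1.1x lower", matching the Publish Throughput and p99 rows.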
# Requires Docker and Docker Compose
./run.sh

This builds the Docker images, runs all benchmarks, generates an HTML report at results/report.html, and opens it in your browser.
Pick which scenarios run and how many iterations each gets via env vars or CLI flags.
Scenario keys: publish, consume, e2e, concurrent, sizes, reliability (or all).
# Dockerized runs — env vars are forwarded into the container
SCENARIOS=e2e,publish RUNS=5 ./run.sh
SCENARIOS=reliability ./run-local.sh
# Direct (no Docker) — requires local RabbitMQ + Redis, then build first
npm run build
npm run bench:e2e # single scenario shortcut
npm run bench -- --scenarios=publish,e2e --runs=5
node --expose-gc dist/runner.js --list # list available scenarios

Per-scenario npm shortcuts: bench:publish, bench:consume, bench:e2e, bench:concurrent, bench:sizes, bench:reliability. All accept extra flags after --, e.g. npm run bench:e2e -- --runs=10.
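Scenario selection boils down to validating a comma-separated list against the six scenario keys. The sketch below shows one way that could look; `parseScenarios` and `parseRuns` are hypothetical names, and the real runner in src/runner.ts may parse its inputs differently. The defaults (all scenarios, 3 runs) follow the behaviour described in this README.

```typescript
// Hypothetical sketch, not the runner's actual parsing code.
const SCENARIO_KEYS = ["publish", "consume", "e2e", "concurrent", "sizes", "reliability"] as const;
type ScenarioKey = (typeof SCENARIO_KEYS)[number];

/** Turns a value such as "e2e,publish" or "all" into a validated list of scenario keys. */
function parseScenarios(input: string | undefined): ScenarioKey[] {
  const requested = (input ?? "all")
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  if (requested.length === 0 || requested.includes("all")) return [...SCENARIO_KEYS];

  const unknown = requested.filter((s) => !(SCENARIO_KEYS as readonly string[]).includes(s));
  if (unknown.length > 0) throw new Error(`Unknown scenario(s): ${unknown.join(", ")}`);

  // Deduplicate and return the keys in their canonical order.
  return SCENARIO_KEYS.filter((key) => requested.includes(key));
}

/** Parses the run count, falling back to 3 when unset. */
function parseRuns(input: string | undefined, fallback: number = 3): number {
  if (input === undefined || input.trim() === "") return fallback;
  const runs = Number(input);
  if (!Number.isInteger(runs) || runs < 1) {
    throw new Error(`RUNS must be a positive integer, got "${input}"`);
  }
  return runs;
}
```

In this sketch an unknown key fails loudly rather than being skipped, so a typo in SCENARIOS cannot silently produce a partial report.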
docker compose up --build --abort-on-container-exit benchmark
# Report: results/report.html
# Raw data: results/results.json
docker compose down

By default the benchmark image installs runmq from npm. To benchmark a local checkout of the RunMQ source instead (e.g. an unreleased branch or local change), use:
./run-local.sh

The script:
- Resolves the RunMQ source path, which defaults to ../queue (sibling directory). Override with RUNMQ_PATH=/path/to/runmq ./run-local.sh.
- Runs npm install (if needed) and npm run build in that directory.
- Runs npm pack and drops the resulting tarball at benchmarks/runmq-local.tgz.
- Builds the benchmark image using Dockerfile.local, which installs the tarball over the npm-published runmq in node_modules.
- Runs the full benchmark suite, opens the HTML report, tears down containers, and removes the tarball.
Manual equivalent (after producing runmq-local.tgz yourself):
# from runmq source dir
npm install && npm run build
npm pack
mv runmq-*.tgz /path/to/benchmarks/runmq-local.tgz
# from benchmarks dir
docker compose -f docker-compose.yml -f docker-compose.local.yml \
up --build --abort-on-container-exit benchmark
docker compose -f docker-compose.yml -f docker-compose.local.yml down

Notes:
- The version field of the local build does not need to match the version requested in package.json; npm install <tgz> overrides it.
- Force a clean rebuild (e.g. after dependency changes) with docker compose -f docker-compose.yml -f docker-compose.local.yml build --no-cache benchmark.
Each scenario runs 3 times. Results show mean ± standard deviation. Message counts are calibrated per scenario to ensure each test runs for at least 3 seconds, avoiding rate extrapolation from short bursts.
| Scenario | Messages per run | What It Measures |
|---|---|---|
| Publish Throughput | 1,000,000 | Batch publish rate using each library's optimal bulk mechanism |
| Consume Throughput | 100,000 | Messages consumed per second with a single no-op consumer |
| End-to-End Latency | 1,000 | User-observable latency from publish API call to consumer handler (p50/p95/p99) |
| Concurrent Consumers | 50,000 x4 | Throughput scaling at 1, 2, 4, 8 concurrent consumers |
| Message Sizes | 500K (100B), 500K (1KB), 150K pub / 50K consume (10KB) | Impact of payload size on publish and consume |
| Reliability Overhead | 100,000 x2 | Cost of enabling retries (3 attempts, 100ms delay) |
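Whether a message count is large enough for the 3-second floor can be checked with simple arithmetic. This is a minimal sketch with hypothetical helper names, assuming a steady rate for the whole run; it is not code from the benchmark itself.

```typescript
// Hypothetical helpers for sanity-checking message counts against the 3-second floor.

/** Expected wall-clock duration of one run, in seconds, at a given sustained rate. */
function expectedDurationSeconds(messageCount: number, msgPerSec: number): number {
  return messageCount / msgPerSec;
}

/** Smallest message count that keeps a run at or above the minimum duration. */
function minMessagesFor(msgPerSec: number, minSeconds: number = 3): number {
  return Math.ceil(msgPerSec * minSeconds);
}
```

Using the reported RunMQ publish mean of 67,206 msg/s, `minMessagesFor(67206)` is 201,618, and the configured 1,000,000 messages correspond to roughly 14.9 seconds per run.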
- 3 runs per scenario: All runs are measured and averaged.
- Mean ± stddev: All results report the arithmetic mean and standard deviation across runs.
- No extrapolation: Message counts are sized so each test runs for 3+ seconds at the fastest library's rate, preventing inflated msg/s from sub-second bursts.
- Per-batch payload generation: Payloads are generated in batches of 500 to prevent OOM at high message counts (e.g., 1M publish).
- GC isolation: global.gc() (double-pass) is forced before every library run across all iterations.
- Equal settling time: Both libraries get identical 1000ms sleep before each run.
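The mean ± stddev aggregation described above can be sketched as follows. This README does not say whether the benchmark uses the sample (n − 1) or population (n) standard deviation; the sketch assumes the sample form, and the function names are hypothetical.

```typescript
// Hypothetical aggregation helpers; assumes the sample (n - 1) standard deviation.

/** Arithmetic mean of the per-run results. */
function mean(values: number[]): number {
  if (values.length === 0) throw new Error("mean of an empty list is undefined");
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

/** Sample standard deviation across runs; 0 when there is only one run. */
function sampleStddev(values: number[]): number {
  if (values.length < 2) return 0;
  const m = mean(values);
  const squaredDeviations = values.reduce((sum, v) => sum + (v - m) ** 2, 0);
  return Math.sqrt(squaredDeviations / (values.length - 1));
}
```

With only 3 runs per scenario the two definitions differ noticeably (by a factor of about 1.22), so it is worth checking which one the report uses before comparing spreads across tools.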
| Setting | RunMQ | BullMQ |
|---|---|---|
| API | loop of publish(), which amqplib TCP-batches automatically | addBulk(), a single Redis pipeline per batch |
| Batch size | 500 (TCP auto-batched) | 500 (explicit addBulk()) |
| Durability | Durable exchange (default) | Default Redis persistence |
| Payload | 100-byte JSON | 100-byte JSON |
| Warmup | 100 messages | 100 messages |
| Setting | RunMQ | BullMQ |
|---|---|---|
| Concurrency | consumersCount: 1 | concurrency: 1 |
| Stall detection | N/A | Disabled (skipStalledCheck: true) |
| Lock renewal | N/A | Disabled (skipLockRenewal: true) |
| Drain delay | N/A | 1ms (minimum allowed) |
| Job cleanup | Messages acked (removed from queue) | removeOnComplete: true, removeOnFail: true |
| Handler | async () => {} (no-op) | async () => {} (no-op) |
| Warmup | 100 messages consumed before measurement | 100 messages consumed before measurement |
| Measurement | first-consumed → last-consumed | first-consumed → last-consumed |
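The first-consumed → last-consumed measurement with a warmup phase can be sketched as a small counter. The class below is hypothetical and only illustrates the idea; in particular, dividing the measured message count by the elapsed time (rather than the count minus one) is an assumption, as is passing timestamps in explicitly.

```typescript
// Hypothetical sketch of first-consumed -> last-consumed throughput measurement.
class ConsumeMeter {
  private readonly warmupCount: number;
  private seen = 0;
  private measured = 0;
  private firstAtMs: number | null = null;
  private lastAtMs: number | null = null;

  constructor(warmupCount: number = 100) {
    this.warmupCount = warmupCount;
  }

  /** Call once per consumed message with a monotonic timestamp in milliseconds. */
  record(nowMs: number): void {
    this.seen += 1;
    if (this.seen <= this.warmupCount) return; // warmup messages are not measured
    if (this.firstAtMs === null) this.firstAtMs = nowMs;
    this.lastAtMs = nowMs;
    this.measured += 1;
  }

  /** Messages per second between the first and last measured message. */
  msgPerSec(): number {
    if (this.firstAtMs === null || this.lastAtMs === null) return 0;
    const elapsedMs = this.lastAtMs - this.firstAtMs;
    return elapsedMs > 0 ? (this.measured * 1000) / elapsedMs : 0;
  }
}
```

Because the clock starts at the first measured message, time spent publishing or waiting for the first delivery does not enter the rate.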
| Setting | RunMQ | BullMQ |
|---|---|---|
| Concurrency | consumersCount: 1 | concurrency: 1 |
| Inter-message delay | 5ms | 5ms |
| Timestamp | performance.now() BEFORE publish() call | performance.now() BEFORE add() call |
| Measures | buffer write + TCP transit + broker routing + push to consumer | Redis write round-trip + worker BRPOP pickup |
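Each latency sample is the handler timestamp minus the timestamp taken before the publish call, and the p50/p95/p99 values are percentiles over those samples. The sketch below uses the nearest-rank method, which is an assumption; this README does not state which percentile definition the benchmark uses, and the function names are hypothetical.

```typescript
// Hypothetical percentile helpers using the nearest-rank method (an assumption).

/** Returns the p-th percentile (0 < p <= 100) of the latency samples. */
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no latency samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // Nearest rank: the smallest value with at least p% of samples at or below it.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

/** Summarises one run's samples into the percentiles shown in the results table. */
function latencyPercentiles(samplesMs: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}
```

With 1,000 messages per run, nearest-rank p99 is the 990th smallest sample, so the ten slowest deliveries determine everything above it.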
| Setting | RunMQ | BullMQ |
|---|---|---|
| Concurrency levels | 1, 2, 4, 8 | 1, 2, 4, 8 |
| Implementation | N AMQP consumers on N channels | N concurrent processors in 1 worker |
| Simulated work | 1ms async delay per message | 1ms async delay per message |
| Setting | RunMQ | BullMQ |
|---|---|---|
| Payload sizes | 100B, 1KB, 10KB | 100B, 1KB, 10KB |
| Serialization | JSON → Buffer (AMQP body) | JSON → Redis string |
| Publish method | publishBatch(), TCP auto-batched | addBulk(), Redis pipeline |
| Both | Identical JSON generated by generatePayload() | Identical JSON generated by generatePayload() |
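One way to produce byte-identical JSON of an exact size is to pad a single string field. The function below is a hypothetical stand-in and is not the benchmark's generatePayload(); the field name and padding character are assumptions, and it relies on ASCII-only content so that string length equals UTF-8 byte length.

```typescript
// Hypothetical stand-in for generatePayload(); the real implementation may differ.

/** Builds a deterministic JSON string whose serialized size is exactly sizeBytes. */
function generatePayloadSketch(sizeBytes: number): string {
  const envelopeLength = JSON.stringify({ data: "" }).length; // bytes used by the wrapper
  const paddingLength = sizeBytes - envelopeLength;
  if (paddingLength < 0) {
    throw new RangeError(`sizeBytes must be at least ${envelopeLength}`);
  }
  // ASCII padding: one character is one byte once encoded as UTF-8.
  return JSON.stringify({ data: "x".repeat(paddingLength) });
}
```

Because the output depends only on the requested size, both adapters can be handed the same string and any throughput difference cannot come from payload content.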
| Setting | RunMQ | BullMQ |
|---|---|---|
| Basic | No retry config | No retry config |
| With retries | attempts: 3, attemptsDelay: 100 | attempts: 3, backoff: { type: 'fixed', delay: 100 } |
| Mechanism | Dead-letter exchange + TTL requeue | Redis delayed set |
| Note | No messages intentionally failed | No messages intentionally failed |
- Equal Docker resources: Both RabbitMQ and Redis receive identical limits — 2 CPUs and 4 GB RAM. The benchmark runner gets 2 CPUs and 8 GB.
- No management overhead: RabbitMQ uses the base rabbitmq:3-alpine image (no management plugin HTTP server). Redis uses redis:7-alpine.
- No host port mapping: Brokers communicate over Docker's internal network only.
Both libraries are tuned for maximum throughput:
RunMQ:
- Default prefetch — allows RabbitMQ to pipeline multiple messages to the consumer without waiting for individual acks.
- Silent logger — eliminates I/O overhead from console logging.
BullMQ:
- addBulk() for publishing: a single Redis pipeline per batch instead of one round-trip per message. This is BullMQ's recommended high-throughput pattern.
- skipStalledCheck: true disables the background Redis polling timer.
- skipLockRenewal: true disables the lock renewal timer (jobs complete in <1ms).
- removeOnComplete: true / removeOnFail: true give immediate cleanup and reduce Redis memory pressure.
- drainDelay: 1 sets the minimum idle poll delay (1ms, the lowest BullMQ allows).
- Multi-run averaging: Each scenario runs 3 times. Results are mean ± stddev.
- Sequential runs: Only one library is tested at a time. No resource contention.
- GC before each run: global.gc() is called (double-pass) before each library's run in every iteration.
- Equal settling time: Both libraries get identical 1000ms sleep before each run.
- Consumer warmup: Consume-throughput scenario includes 100-message warmup for both.
- Publish warmup: Publish-throughput scenario includes 100-message warmup for both.
- Unique topics: Each scenario uses unique topic/queue names to prevent stale data.
- Timeouts: All scenarios have 120-second timeouts.
- Identical payloads: The same generatePayload(sizeBytes) function generates byte-identical JSON for both libraries.
- Batch publishing fairness: Both libraries use their optimal bulk mechanism: RunMQ gets TCP auto-batching, BullMQ gets addBulk() Redis pipelines.
- Consume timing: Throughput is measured from first message consumed to last message consumed, excluding the publish phase.
- Latency timing: Sent timestamp is captured BEFORE publish() / add() is called, the same measurement point for both. This measures user-observable latency: total time from calling the API to the consumer handler firing. Both include their full delivery cost (RunMQ: buffer + transit + routing; BullMQ: Redis write + worker pickup).
- Adapter pattern: Both libraries implement the same QueueAdapter interface.
These are genuine architectural differences between the two systems:
| Difference | RunMQ (RabbitMQ) | BullMQ (Redis) |
|---|---|---|
| Broker persistence | Durable queues/exchanges by default. Messages survive broker restarts. | In-memory by default. Durability requires AOF/RDB config. |
| Message routing | Exchange → queue binding with routing keys. | Direct list/stream operations. |
| Consumer model | Push-based: broker pushes messages via AMQP channels. | Poll-based: worker polls Redis for new jobs. |
- Higher msg/s = better for throughput scenarios
- Lower ms = better for latency scenarios
- Results show mean ± standard deviation across 3 runs
- The HTML report shows percentage differences in the summary table
- Raw JSON data is saved to results/results.json for further analysis
To change message counts, edit the TOTAL_MESSAGES constant in each scenario file under src/scenarios/.
To change the number of runs, edit TOTAL_RUNS in src/runner.ts.
Rebuild with:
docker compose build --no-cache benchmark
docker compose up --abort-on-container-exit benchmark

To change Docker resource limits, edit docker-compose.yml.
These are inherent limitations that cannot be fully resolved due to architectural differences between the two systems:
- Publish API shape differs, but durability guarantees match. RunMQ ships with usePublisherConfirms: true on by default, so every publish() awaits a broker ack, the same durability promise as BullMQ's addBulk() Redis pipeline round-trip. The two APIs differ in shape (RunMQ awaits a Promise per message, BullMQ awaits one per bulk) but not in guarantee. The headline numbers reflect each library's optimal confirmed-persistence path. Opting out of confirms (usePublisherConfirms: false) puts RunMQ in fire-and-forget mode and widens the gap further, but that is not the default and not what is measured here.
- Redis is in-memory, RabbitMQ is disk-backed. RabbitMQ persists messages to durable queues by default. Redis operates entirely in-memory. This fundamentally affects latency comparisons and tends to favor BullMQ on E2E latency (in the results above, BullMQ leads on mean, p50, and p95, while RunMQ leads on p99). This is a deliberate design tradeoff, not a benchmark flaw.
- Execution ordering is fixed. RunMQ always runs first in every scenario. The first library to run pays a JIT cold-start cost. This slightly disadvantages RunMQ, not BullMQ. Proper mitigation would be to alternate or randomize order across runs.
- 1ms simulated work in concurrent consumers. At 1ms work duration, the per-message fetch overhead is a significant percentage of total processing time, which favors RunMQ's push-based model over BullMQ's poll-based model. With more realistic work durations (10-100ms), the throughput difference between the two would narrow.
- RunMQ's internal processor chain. RunMQ allocates 6 processor objects per consumed message (deserializer, retries checker, acknowledger, etc.). This overhead is included in RunMQ's consume numbers, making RunMQ look slightly worse than a more optimized implementation could achieve. This is a bias against RunMQ.
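The mitigation suggested for the fixed execution order, alternating which library runs first, can be sketched as below. This is not implemented in the benchmark; the type and function names are hypothetical.

```typescript
// Hypothetical sketch of alternating execution order across runs (not in the benchmark).
type Library = "RunMQ" | "BullMQ";

/** Even-numbered runs start with RunMQ, odd-numbered runs start with BullMQ. */
function alternatingOrder(runIndex: number): [Library, Library] {
  return runIndex % 2 === 0 ? ["RunMQ", "BullMQ"] : ["BullMQ", "RunMQ"];
}

/** Builds the per-run execution order for a whole scenario. */
function buildSchedule(totalRuns: number): [Library, Library][] {
  return Array.from({ length: totalRuns }, (_, i) => alternatingOrder(i));
}
```

With the default of 3 runs, one library still goes first twice, so an even run count would be needed for the cold-start cost to be split evenly.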
If you find additional bias in either direction, please open an issue.
- Docker Engine 20+
- Docker Compose V2
- ~16 GB free memory (4 GB per broker + 8 GB for runner)