Merged
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -28,3 +28,4 @@ jobs:
uses: codecov/codecov-action@v5
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: engine/target/site/jacoco/jacoco.xml
12 changes: 9 additions & 3 deletions .gitignore
@@ -1,12 +1,18 @@
target/
*.iml

# Eclipse / JDT.LS metadata
.classpath
.project
.settings/
.DS_Store
*.class
.omo/
src/test/resources/test_integration_output/
src/test/resources/test_sample_output/
src/test/resources/test_integration_queries/
engine/src/test/resources/test_integration_output/
engine/src/test/resources/test_sample_output/
engine/src/test/resources/test_integration_queries/

# local working files
dev/
CLAUDE.md
REPORT_*.md
229 changes: 84 additions & 145 deletions AGENTS.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion BENCHMARK_RESULTS.md
@@ -2,7 +2,7 @@

## Overview

This document contains quantifiable performance results from the query optimization benchmark suite added to BlazeDB. The benchmarks measure tuple processing reduction achieved by rule-based query optimizations.
This document contains quantifiable performance results from the query optimization benchmark suite added to CuckooDB. The benchmarks measure tuple processing reduction achieved by rule-based query optimizations.

## Benchmark Methodology

72 changes: 43 additions & 29 deletions README.md
@@ -1,4 +1,4 @@
# Java Query Engine
# cuckooDB

![CI](https://github.com/JinBa1/java-query-engine/actions/workflows/ci.yml/badge.svg)
![Coverage](https://codecov.io/gh/JinBa1/java-query-engine/branch/main/graph/badge.svg)
@@ -83,15 +83,15 @@ The focus is on demonstrating query planning, optimisation, and the Volcano iter
git clone https://github.com/JinBa1/java-query-engine.git
cd java-query-engine

# Build fat JAR
./mvnw clean compile assembly:single
# Build fat JAR (engine module)
./mvnw -pl engine -DskipTests clean package
```

**Run a query:**

```bash
java -cp target/java-query-engine-1.0.0-jar-with-dependencies.jar \
com.github.jinba1.blazedb.BlazeDB \
java -cp engine/target/cuckoodb-engine-1.0.0-jar-with-dependencies.jar \
com.github.jinba1.cuckoodb.CuckooDB \
database_dir input_file output_file [--max-tuples=N] [--timeout-ms=N]
```

@@ -159,10 +159,10 @@ The engine supports two join algorithms; the planner selects between them automa

### Benchmarks

Performance was measured with a JMH 1.37 benchmark suite in the `bench/` package (`src/test/java/com/github/jinba1/blazedb/bench/`). The suite is compiled in CI but never run there; run it locally with:
Performance was measured with a JMH 1.37 benchmark suite in the `bench/` package (`engine/src/test/java/com/github/jinba1/cuckoodb/bench/`). The suite is compiled in CI but never run there; run it locally with:

```bash
./mvnw -q test-compile exec:exec -Dexec.executable=java -Dexec.classpathScope=test \
./mvnw -pl engine -q test-compile exec:exec -Dexec.executable=java -Dexec.classpathScope=test \
"-Dexec.args=-cp %classpath org.openjdk.jmh.Main .*Benchmark"
```

@@ -187,7 +187,7 @@ Benchmarks are compiled in CI but never executed there.

## Demo

**Input table** (`samples/db/data/Student.csv`):
**Input table** (`engine/samples/db/data/Student.csv`):

```
A, B, C, D
@@ -199,7 +199,7 @@ A, B, C, D
6, 300, 400, 11
```

**Query** (`samples/input/query4.sql`):
**Query** (`engine/samples/input/query4.sql`):

```sql
SELECT * FROM Student WHERE Student.A < 3;
@@ -208,17 +208,17 @@ SELECT * FROM Student WHERE Student.A < 3;
**Command:**

```bash
java -cp target/java-query-engine-1.0.0-jar-with-dependencies.jar \
com.github.jinba1.blazedb.BlazeDB \
samples/db samples/input/query4.sql output.csv
java -cp engine/target/cuckoodb-engine-1.0.0-jar-with-dependencies.jar \
com.github.jinba1.cuckoodb.CuckooDB \
engine/samples/db engine/samples/input/query4.sql output.csv
```

To limit resource usage, add optional budget flags:

```bash
java -cp target/java-query-engine-1.0.0-jar-with-dependencies.jar \
com.github.jinba1.blazedb.BlazeDB \
samples/db samples/input/query4.sql output.csv --max-tuples=10000 --timeout-ms=5000
java -cp engine/target/cuckoodb-engine-1.0.0-jar-with-dependencies.jar \
com.github.jinba1.cuckoodb.CuckooDB \
engine/samples/db engine/samples/input/query4.sql output.csv --max-tuples=10000 --timeout-ms=5000
```
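
How the engine parses `--max-tuples` and `--timeout-ms` is not shown in this diff. As a rough sketch only, optional flags of this shape could be handled along the following lines. `BudgetFlags`, its fields, and the rule that a budget must be a positive number are hypothetical and are not taken from the CuckooDB source.

```java
import java.util.OptionalLong;

/** Hypothetical holder for the two optional budgets; not a class from the CuckooDB source. */
final class BudgetFlags {
    private static final String MAX_TUPLES = "--max-tuples=";
    private static final String TIMEOUT_MS = "--timeout-ms=";

    final OptionalLong maxTuples;
    final OptionalLong timeoutMs;

    private BudgetFlags(OptionalLong maxTuples, OptionalLong timeoutMs) {
        this.maxTuples = maxTuples;
        this.timeoutMs = timeoutMs;
    }

    /** Parses every argument after the three positional ones (database_dir, input_file, output_file). */
    static BudgetFlags parse(String[] args) {
        OptionalLong maxTuples = OptionalLong.empty();
        OptionalLong timeoutMs = OptionalLong.empty();
        for (int i = 3; i < args.length; i++) {
            String arg = args[i];
            if (arg.startsWith(MAX_TUPLES)) {
                maxTuples = OptionalLong.of(parsePositive(arg, MAX_TUPLES));
            } else if (arg.startsWith(TIMEOUT_MS)) {
                timeoutMs = OptionalLong.of(parsePositive(arg, TIMEOUT_MS));
            } else {
                throw new IllegalArgumentException("Unknown flag: " + arg);
            }
        }
        return new BudgetFlags(maxTuples, timeoutMs);
    }

    // Assumption: a budget must be a positive whole number.
    private static long parsePositive(String arg, String prefix) {
        long value;
        try {
            value = Long.parseLong(arg.substring(prefix.length()));
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a number: " + arg, e);
        }
        if (value <= 0) {
            throw new IllegalArgumentException("Must be positive: " + arg);
        }
        return value;
    }

    public static void main(String[] args) {
        BudgetFlags flags = parse(new String[] {
            "engine/samples/db", "engine/samples/input/query4.sql", "output.csv",
            "--max-tuples=10000", "--timeout-ms=5000"});
        System.out.println("max tuples: " + flags.maxTuples.getAsLong());
        System.out.println("timeout ms: " + flags.timeoutMs.getAsLong());
    }
}
```

Using `OptionalLong` keeps "flag absent" distinct from any numeric value, so a caller can skip the budget check entirely when a flag was not supplied.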

**Output** (`output.csv`):
@@ -231,15 +231,24 @@ a,b,c,d

## Running Examples

The `samples/` directory ships with 20 queries and a small dataset (Student, Course, Enrolled, Staff tables). Expected output lives in `samples/expected_output/`.
The `engine/samples/` directory ships with 20 queries and a small dataset (Student, Course, Enrolled, Staff tables). Expected output lives in `engine/samples/expected_output/`.

Run all 20 through the bundled runner, which diffs each result against the expected output and reports pass/fail. It is launched via `exec:exec` (not `exec:java`) so it runs with the engine module as the working directory — `exec:java` would keep the working directory at the reactor root and fail to find `samples/`:

```bash
./mvnw -pl engine -q test-compile exec:exec -Dexec.executable=java -Dexec.classpathScope=test \
"-Dexec.args=-cp %classpath com.github.jinba1.cuckoodb.SampleQueryRunner"
```

Or run each query through the CLI and diff manually:

```bash
# Run all sample queries and diff against expected output
for i in $(seq 1 20); do
java -cp target/java-query-engine-1.0.0-jar-with-dependencies.jar \
com.github.jinba1.blazedb.BlazeDB \
samples/db "samples/input/query${i}.sql" "/tmp/out${i}.csv"
diff "samples/expected_output/query${i}.csv" "/tmp/out${i}.csv" && echo "query${i}: OK"
java -cp engine/target/cuckoodb-engine-1.0.0-jar-with-dependencies.jar \
com.github.jinba1.cuckoodb.CuckooDB \
engine/samples/db "engine/samples/input/query${i}.sql" "/tmp/out${i}.csv"
diff "engine/samples/expected_output/query${i}.csv" "/tmp/out${i}.csv" && echo "query${i}: OK"
done
```
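
The source of `SampleQueryRunner` is not part of this diff. The comparison step that both approaches above rely on can be sketched roughly as follows. `ExpectedOutputCheck` and its two directory arguments are hypothetical; the file names mirror the shell loop (`query${i}.csv` and `out${i}.csv`). Unlike `diff`, this sketch compares line contents only, so differences in line endings or a missing final newline are ignored.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical checker; not the project's SampleQueryRunner. */
final class ExpectedOutputCheck {

    /** Returns true when both files contain the same lines in the same order. */
    static boolean sameLines(Path expected, Path actual) throws IOException {
        List<String> expectedLines = Files.readAllLines(expected);
        List<String> actualLines = Files.readAllLines(actual);
        return expectedLines.equals(actualLines);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: ExpectedOutputCheck expected_dir actual_dir");
            return;
        }
        Path expectedDir = Path.of(args[0]);
        Path actualDir = Path.of(args[1]);
        int failures = 0;
        for (int i = 1; i <= 20; i++) {
            Path expected = expectedDir.resolve("query" + i + ".csv");
            Path actual = actualDir.resolve("out" + i + ".csv");
            // A missing file counts as a failure rather than an exception.
            boolean ok = Files.exists(expected) && Files.exists(actual) && sameLines(expected, actual);
            System.out.println("query" + i + ": " + (ok ? "OK" : "FAIL"));
            if (!ok) {
                failures++;
            }
        }
        System.out.println(failures + " of 20 queries failed");
    }
}
```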

@@ -254,15 +263,20 @@ The test suite covers individual operators, the query planner, the optimiser, ex
## Project Structure

```
├── src/main/java/com/github/jinba1/blazedb/ # Core engine (35 files)
│ └── operator/ # Volcano operators (11 files, incl. HashJoinOperator)
├── src/test/java/com/github/jinba1/blazedb/ # JUnit 5 tests (339 tests across 33 files)
├── samples/
│ ├── db/data/ # CSV data files (header row + data rows)
│ ├── input/query[1-20].sql # Sample queries
│ └── expected_output/query[1-20].csv # Expected results
├── pom.xml # Maven config (Java 17, JSqlParser 4.7, commons-csv 1.14.1, JMH 1.37 test-scope)
├── mvnw / mvnw.cmd # Maven Wrapper
├── pom.xml # Parent POM (aggregator: engine + server; Java 17, dep/plugin management)
├── engine/ # Pure query engine — zero Spring dependencies
│ ├── pom.xml # cuckoodb-engine (JSqlParser 4.7, commons-csv 1.14.1, JMH 1.37 test-scope)
│ ├── src/main/java/com/github/jinba1/cuckoodb/ # Core engine (35 files)
│ │ └── operator/ # Volcano operators (11 files, incl. HashJoinOperator)
│ ├── src/test/java/com/github/jinba1/cuckoodb/ # JUnit 5 tests (339 tests across 33 files)
│ └── samples/
│ ├── db/data/ # CSV data files (header row + data rows)
│ ├── input/query[1-20].sql # Sample queries
│ └── expected_output/query[1-20].csv # Expected results
├── server/ # REST gateway skeleton — Spring Boot REST gateway planned
│ ├── pom.xml # cuckoodb-server (depends on cuckoodb-engine; no Spring yet)
│ └── src/main/java/com/github/jinba1/cuckoodb/server/
├── mvnw / mvnw.cmd # Maven Wrapper
└── LICENSE
```

123 changes: 123 additions & 0 deletions engine/pom.xml
@@ -0,0 +1,123 @@
<?xml version="1.0" encoding="UTF-8"?>

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<parent>
<groupId>com.github.jinba1</groupId>
<artifactId>cuckoodb-parent</artifactId>
<version>1.0.0</version>
<relativePath>../pom.xml</relativePath>
</parent>

<artifactId>cuckoodb-engine</artifactId>
<packaging>jar</packaging>

<name>cuckooDB Engine</name>

<dependencies>
<dependency>
<groupId>com.github.jsqlparser</groupId>
<artifactId>jsqlparser</artifactId>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-csv</artifactId>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-api</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.junit.jupiter</groupId>
<artifactId>junit-jupiter-engine</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
<scope>test</scope>
</dependency>
</dependencies>

<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<testAnnotationProcessorPaths>
<path>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-generator-annprocess</artifactId>
<version>${jmh.version}</version>
</path>
</testAnnotationProcessorPaths>
</configuration>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<executions>
<execution>
<id>run-cuckoodb</id>
<goals>
<goal>java</goal>
</goals>
<configuration>
<mainClass>com.github.jinba1.cuckoodb.CuckooDB</mainClass>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<archive>
<manifest>
<mainClass>com.github.jinba1.cuckoodb.CuckooDB</mainClass>
</manifest>
</archive>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.jacoco</groupId>
<artifactId>jacoco-maven-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>prepare-agent</goal>
</goals>
</execution>
<execution>
<id>report</id>
<phase>test</phase>
<goals>
<goal>report</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
File renamed without changes (24 files).
@@ -1,4 +1,4 @@
package com.github.jinba1.blazedb;
package com.github.jinba1.cuckoodb;

import net.sf.jsqlparser.expression.Expression;

@@ -1,4 +1,4 @@
package com.github.jinba1.blazedb;
package com.github.jinba1.cuckoodb;

/**
* The aggregate functions supported in SELECT lists.
@@ -1,4 +1,4 @@
package com.github.jinba1.blazedb;
package com.github.jinba1.cuckoodb;

import net.sf.jsqlparser.expression.ExpressionVisitorAdapter;
import net.sf.jsqlparser.schema.Column;
@@ -1,4 +1,4 @@
package com.github.jinba1.blazedb;
package com.github.jinba1.cuckoodb;

import net.sf.jsqlparser.schema.Column;

@@ -1,4 +1,4 @@
package com.github.jinba1.blazedb;
package com.github.jinba1.cuckoodb;

/** The data type of a table column. */
public enum ColumnType { INT, STRING }
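
As an illustration of how a two-valued column type like this might be used when loading CSV data, here is a minimal sketch. `CellParser` is hypothetical, the nested enum is a local stand-in that mirrors the `INT` and `STRING` constants above, and trimming whitespace around each cell is an assumption rather than documented engine behaviour.

```java
/** Hypothetical helper; not a class from the CuckooDB source. */
final class CellParser {

    /** Local stand-in mirroring the two constants of the ColumnType enum in the diff. */
    enum ColumnType { INT, STRING }

    /** Converts one raw CSV cell into a typed value according to its column type. */
    static Object parse(ColumnType type, String raw) {
        String trimmed = raw.trim(); // assumption: surrounding whitespace is not significant
        switch (type) {
            case INT:
                return Integer.valueOf(trimmed); // throws NumberFormatException on bad input
            case STRING:
                return trimmed;
            default:
                throw new IllegalStateException("Unhandled column type: " + type);
        }
    }

    public static void main(String[] args) {
        Object number = parse(ColumnType.INT, " 300");
        Object text = parse(ColumnType.STRING, " Alice ");
        System.out.println(number.getClass().getSimpleName() + " / " + text.getClass().getSimpleName());
    }
}
```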
@@ -1,4 +1,4 @@
package com.github.jinba1.blazedb;
package com.github.jinba1.cuckoodb;

import net.sf.jsqlparser.expression.BinaryExpression;
import net.sf.jsqlparser.expression.Expression;
@@ -1,7 +1,7 @@
package com.github.jinba1.blazedb;
package com.github.jinba1.cuckoodb;

/**
* Defines global constants used throughout the BlazeDB system.
* Defines global constants used throughout the CuckooDB system.
* This class contains application-wide constants to ensure consistency
* and reduce duplication across the codebase. These include file and directory names,
* schema prefixes, and other string literals used for database operations.
@@ -1,4 +1,4 @@
package com.github.jinba1.blazedb;
package com.github.jinba1.cuckoodb;

import java.io.*;
import java.nio.file.Files;
@@ -9,14 +9,14 @@
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVPrinter;

import com.github.jinba1.blazedb.operator.LimitOperator;
import com.github.jinba1.blazedb.operator.Operator;
import com.github.jinba1.cuckoodb.operator.LimitOperator;
import com.github.jinba1.cuckoodb.operator.Operator;

/**
* Lightweight in-memory relational query engine.
* CLI interface: database_dir input_file output_file [--max-tuples=N] [--timeout-ms=N]
*/
public class BlazeDB {
public class CuckooDB {
public static void main(String[] args) {
int code = run(args);
if (code != 0) {
@@ -99,7 +99,7 @@ static int run(String[] args) {

private static void usage() {
System.err.println(
"Usage: BlazeDB database_dir input_file output_file [--max-tuples=N] [--timeout-ms=N]");
"Usage: CuckooDB database_dir input_file output_file [--max-tuples=N] [--timeout-ms=N]");
}

/**
@@ -1,4 +1,4 @@
package com.github.jinba1.blazedb;
package com.github.jinba1.cuckoodb;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
@@ -13,7 +13,7 @@
import java.util.stream.Stream;

/**
* The DBCatalog class serves as a central repository for durable database metadata in BlazeDB.
* The DBCatalog class serves as a central repository for durable database metadata in CuckooDB.
* It implements the singleton pattern to ensure a single, consistent view of database structure
* across all components of the system.
* This class maintains information about: