Skip to content

feat(sql): add arg_min() and arg_max() aggregate functions#6652

Merged
bluestreak01 merged 1 commit intomasterfrom
vi_arg_mm
Jan 15, 2026
Merged

feat(sql): add arg_min() and arg_max() aggregate functions#6652
bluestreak01 merged 1 commit intomasterfrom
vi_arg_mm

Conversation

@bluestreak01
Copy link
Member

Summary

Implements arg_min(value, key) and arg_max(value, key) aggregate functions that return the value of the first argument at the minimum/maximum value of the second argument. These are commonly used analytics functions for finding values at extreme points.

Supported type combinations (18 total):

Function Value Type Key Type Signature
arg_max double double arg_max(DD)
arg_max double timestamp arg_max(DN)
arg_max timestamp double arg_max(ND)
arg_max long double arg_max(LD)
arg_max double long arg_max(DL)
arg_max long timestamp arg_max(LN)
arg_max timestamp long arg_max(NL)
arg_max uuid timestamp arg_max(ZN)
arg_max timestamp uuid arg_max(NZ)
arg_min double double arg_min(DD)
arg_min double timestamp arg_min(DN)
arg_min timestamp double arg_min(ND)
arg_min long double arg_min(LD)
arg_min double long arg_min(DL)
arg_min long timestamp arg_min(LN)
arg_min timestamp long arg_min(NL)
arg_min uuid timestamp arg_min(ZN)
arg_min timestamp uuid arg_min(NZ)

Key features:

  • Full parallel execution support with proper merge logic
  • Correct null handling for both keys and values
  • UUID comparison using unsigned long comparison

Test plan

  • Basic functionality tests for all 18 variants
  • GROUP BY clause tests
  • Null key handling (null keys are ignored)
  • Null value handling (returns null if value at min/max key is null)
  • Parallel execution tests with 4 workers (2M rows)
  • Parallel merge with null keys (50% null)
  • Parallel merge with null dest and valid src

🤖 Generated with Claude Code

@coderabbitai
Copy link

coderabbitai bot commented Jan 15, 2026

Walkthrough

This pull request introduces 24 new group-by aggregate functions: 12 arg_max variants and 12 arg_min variants, each supporting different type combinations (Double, Long, Timestamp, UUID). Each function includes a factory class and comprehensive test suite.

Changes

Cohort / File(s) Summary
ArgMax Function Implementations
core/src/main/java/io/questdb/griffin/engine/functions/groupby/ArgMax*.java (12 files)
Implements arg_max aggregations tracking maximum key values across type pairs: Double-Double, Double-Long, Double-Timestamp, Long-Double, Long-Timestamp, Timestamp-Double, Timestamp-Long, Timestamp-UUID, UUID-Timestamp. Each maintains per-group state with value and corresponding max key, handling null/NaN semantics and merge logic for parallel aggregation.
ArgMax Factory Implementations
core/src/main/java/io/questdb/griffin/engine/functions/groupby/ArgMax*Factory.java (12 files)
Provides FunctionFactory implementations declaring function signatures (arg_max(XX) format) and constructing corresponding ArgMax function instances.
ArgMin Function Implementations
core/src/main/java/io/questdb/griffin/engine/functions/groupby/ArgMin*.java (12 files)
Mirrors ArgMax implementations but for minimum key tracking across same type combinations. Handles null/NaN semantics, state merging, and per-group aggregation state management.
ArgMin Factory Implementations
core/src/main/java/io/questdb/griffin/engine/functions/groupby/ArgMin*Factory.java (12 files)
Provides FunctionFactory implementations declaring arg_min(XX) signatures and constructing corresponding ArgMin function instances.
Test Suites
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ArgMax*Test.java (9 files)
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ArgMin*Test.java (9 files)
Comprehensive test classes covering basic functionality, null handling (all-null values/keys), parallel execution with WorkerPool, chunky data processing, merge scenarios, and group-by correctness across multiple type combinations.
Test Formatting Update
core/src/test/java/io/questdb/test/griffin/engine/functions/array/DoubleArrayRoundFunctionFactoryTest.java
Converts string concatenations to text blocks for expected SQL outputs and type descriptions; no functional changes.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested labels

SQL, Enhancement

Suggested reviewers

  • puzpuzpuz
  • nwoolmer
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately describes the main feature: adding arg_min() and arg_max() aggregate functions. It is concise, specific, and directly reflects the primary change in the changeset.
Description check ✅ Passed The PR description provides comprehensive details including a summary of the functions, a complete table of 18 supported type combinations, key features (parallel execution, null handling, UUID comparison), and a detailed test plan with checkmarks. It is clearly related to the changeset and provides meaningful context.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

🧹 Recent nitpick comments
core/src/test/java/io/questdb/test/griffin/engine/functions/array/DoubleArrayRoundFunctionFactoryTest.java (1)

38-271: Text block refactoring looks good, but appears unrelated to this PR's purpose.

This file tests the round() function for double arrays, which doesn't appear related to the arg_min/arg_max aggregate functions described in the PR objectives. The text block refactoring improves readability and is correctly implemented, but including unrelated formatting changes in a feature PR can make reviews more difficult and muddies the git history.

Consider keeping such refactoring separate from feature additions in future PRs to maintain clear separation of concerns.

core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ArgMaxTimestampDoubleGroupByFunctionFactoryTest.java (3)

171-177: Self-comparison in parallel test doesn't verify correctness.

assertSqlCursors is called with the same SQL for both expected and actual parameters. This only verifies that the query executes without error in parallel mode, but doesn't actually validate the result correctness.

Consider comparing against a sequential execution or a known-correct baseline:

💡 Suggested approach
-                        TestUtils.assertSqlCursors(
-                                engine,
-                                sqlExecutionContext,
-                                sql,
-                                sql,
-                                LOG
-                        );
+                        // Compare parallel execution result against sequential baseline
+                        String sequentialSql = "select sym, arg_max(value, key) from tab group by sym order by sym";
+                        // Run with single worker context for comparison, or use assertSql with expected values
+                        TestUtils.assertSqlCursors(
+                                engine,
+                                sqlExecutionContext,
+                                sequentialSql,  // expected (could use a single-threaded context)
+                                sql,            // actual (parallel)
+                                LOG
+                        );

Alternatively, compute expected results using a subquery or CTE that doesn't rely on arg_max, then compare.


200-206: Same self-comparison issue as noted above.

This test also compares the SQL query to itself. Consider applying the same fix to verify parallel results against a sequential or computed baseline.


229-235: Same self-comparison issue as noted above.

This test also compares the SQL query to itself.

core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ArgMaxTimestampUuidGroupByFunctionFactoryTest.java (1)

57-66: Consider strengthening parallel test assertions.

The pattern TestUtils.assertSqlCursors(engine, sqlExecutionContext, sql, sql, LOG) compares a query against itself. While this verifies the query executes without error and produces consistent results, it doesn't validate that the result is actually correct. For non-deterministic data (like rnd_uuid4()), this may be acceptable for testing parallel consistency, but it won't catch logic errors in the aggregation.

Consider adding at least one parallel test with deterministic data where the expected result can be explicitly verified, similar to the pattern used in testArgMaxSimple.

core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ArgMaxDoubleTimestampGroupByFunctionFactoryTest.java (1)

185-212: Consider clarifying test intent.

The timestamp_sequence(0, 1000) in the CASE expression always restarts from 0 for each row where x > 1000000, rather than continuing from the previous sequence. This still exercises the merge-null-dest-valid-src scenario, but the actual timestamp values will all be 0 for rows with valid keys.

This is likely intentional to test the merge logic (not the timestamp ordering), but a brief inline comment would clarify.

core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/ArgMinDoubleTimestampGroupByFunctionFactoryTest.java (1)

57-66: Consider adding a verification against a known result.

The assertSqlCursors(sql, sql) pattern validates determinism (same query returns same results) but doesn't verify correctness against a reference implementation. This is acceptable for large-scale parallel testing where manual expected values are impractical, but consider adding at least one parallel test with a smaller dataset that asserts specific expected values.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@bluestreak01
Copy link
Member Author

@CodeRabbit review

@coderabbitai
Copy link

coderabbitai bot commented Jan 15, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@glasstiger
Copy link
Contributor

[PR Coverage check]

😍 pass : 884 / 904 (97.79%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinDoubleTimestampGroupByFunction.java 44 47 93.62%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinDoubleDoubleGroupByFunction.java 44 47 93.62%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxDoubleDoubleGroupByFunction.java 44 47 93.62%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinDoubleLongGroupByFunction.java 44 47 93.62%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxDoubleTimestampGroupByFunction.java 44 47 93.62%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxDoubleLongGroupByFunction.java 44 47 93.62%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinTimestampUuidGroupByFunction.java 53 54 98.15%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxTimestampUuidGroupByFunction.java 53 54 98.15%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxDoubleDoubleGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinTimestampUuidGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxTimestampDoubleGroupByFunction.java 44 44 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinDoubleTimestampGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxDoubleLongGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxTimestampLongGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxTimestampUuidGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinLongTimestampGroupByFunction.java 44 44 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxLongTimestampGroupByFunction.java 44 44 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinUuidTimestampGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxTimestampDoubleGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinTimestampDoubleGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinTimestampLongGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxUuidTimestampGroupByFunction.java 45 45 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinDoubleLongGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinLongDoubleGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinTimestampLongGroupByFunction.java 44 44 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxDoubleTimestampGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxLongDoubleGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinLongTimestampGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinDoubleDoubleGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxTimestampLongGroupByFunction.java 44 44 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxLongDoubleGroupByFunction.java 44 44 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxLongTimestampGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinUuidTimestampGroupByFunction.java 45 45 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMaxUuidTimestampGroupByFunctionFactory.java 4 4 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinTimestampDoubleGroupByFunction.java 44 44 100.00%
🔵 io/questdb/griffin/engine/functions/groupby/ArgMinLongDoubleGroupByFunction.java 44 44 100.00%

@bluestreak01 bluestreak01 merged commit 8c90341 into master Jan 15, 2026
43 checks passed
@bluestreak01 bluestreak01 deleted the vi_arg_mm branch January 15, 2026 22:45
@bluestreak01 bluestreak01 added New feature Feature requests SQL Issues or changes relating to SQL execution labels Jan 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

New feature Feature requests SQL Issues or changes relating to SQL execution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants