Skip to content

fix(core): fix bad connection state after TLS session initialization failure#6529

Merged
bluestreak01 merged 1 commit intomasterfrom
jh_context_double_close
Dec 12, 2025
Merged

fix(core): fix bad connection state after TLS session initialization failure#6529
bluestreak01 merged 1 commit intomasterfrom
jh_context_double_close

Conversation

@jerrinot
Copy link
Contributor

@jerrinot jerrinot commented Dec 12, 2025

When TLS session initialization failed, then the connection context was closed twice, and it could be also pooled twice. This could lead to issues in later connections.

Joint work with @RaphDal

Note for reviewers: disconnect() enqueues the context in the disconnectQueue. The queue is later processed, and the connection is closed.

…failure

When TLS session initialization failed, then the connection context
was closed twice, and it could be also pooled twice. This could
lead to issues in later connections.

Joint work with @RaphDal
@jerrinot jerrinot added the Bug Incorrect or unexpected behavior label Dec 12, 2025
@coderabbitai
Copy link

coderabbitai bot commented Dec 12, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch jh_context_double_close

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jerrinot
Copy link
Contributor Author

hello @coderabbitai, my favourite robot, please review this fine art! :)

@coderabbitai
Copy link

coderabbitai bot commented Dec 12, 2025

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash

# Fetch the PR details including the diff
gh pr view 6529 --json files,additions,deletions,title,body

# Get the diff for this PR
gh pr diff 6529

Length of output: 1380


🏁 Script executed:

#!/bin/bash

# Get more context around the changed lines
gh pr diff 6529 --patch | head -60

# Look for similar patterns where both done() and disconnect() might be called
rg -n -A2 -B2 "ioContextFactory\.done.*\n.*disconnect\(" core/src/main/java/io/questdb/network/

# Check how disconnect() is implemented to understand the cleanup
ast-grep --pattern 'disconnect($_, $_) {
  $$$
}'

# Look for the disconnect method in AbstractIODispatcher
rg -n -A10 "protected.*void disconnect\(" core/src/main/java/io/questdb/network/AbstractIODispatcher.java

Length of output: 1537


🏁 Script executed:

#!/bin/bash

# Get the full context of the method containing the fix
cat core/src/main/java/io/questdb/network/AbstractIODispatcher.java | sed -n '200,250p'

# Look for the disconnect method implementation
rg -n -A15 "^\s*protected.*disconnect\(C context" core/src/main/java/io/questdb/network/AbstractIODispatcher.java

# Search for other places where ioContextFactory.done() is called
rg -n -C3 "ioContextFactory\.done\(" core/src/main/java/io/questdb/network/AbstractIODispatcher.java

# Look for the disconnectQueue mentioned in the PR description
rg -n -C2 "disconnectQueue" core/src/main/java/io/questdb/network/AbstractIODispatcher.java

Length of output: 2990


🏁 Script executed:

#!/bin/bash

# Check if there are any other exception handlers that might have similar patterns
rg -n -B5 -A5 "catch.*Exception" core/src/main/java/io/questdb/network/AbstractIODispatcher.java | grep -A10 -B10 "ioContextFactory.done"

# Look for any other direct calls to done() that might be problematic
rg -n -B3 -A3 "ioContextFactory\.done" core/src/main/java/io/questdb/network/

# Check the IOContextFactory interface to understand the done() method contract
fd -t f "IOContextFactory.java" --exec cat {}

Length of output: 1887


@jerrinot, Hello! Thank you for the kind words! 🐰 I'd be delighted to review this fine piece of work!

Review Summary

This is an excellent fix for a subtle but serious bug! The change is minimal, surgical, and addresses the root cause perfectly.

Analysis

The Problem

When TLS session initialization failed, the exception handler was calling both:

  1. ioContextFactory.done(connectionContext) - immediately freeing the context
  2. disconnect(connectionContext, ...) - enqueuing for cleanup via disconnectQueue

This caused a double-cleanup scenario where:

  • The context was freed twice
  • The context could be returned to the pool twice
  • Later connections could receive corrupted/invalid context objects

The Solution

By removing the immediate ioContextFactory.done() call, the cleanup now follows the proper flow:

disconnect() → disconnectQueue → processDisconnects() → ioContextFactory.done()

Why This Works

Looking at the code flow:

  • Line 189-190: disconnect() enqueues the context into disconnectQueue
  • Line 498: processDisconnects() consumes the queue
  • Line 467-471: Eventually calls ioContextFactory.done(context) to perform the actual cleanup

The manual call to done() was redundant and harmful. The disconnect() method already handles everything properly through the asynchronous disconnect queue mechanism.

Verdict

LGTM! This is a textbook example of how to fix a resource management bug - identify the double-cleanup, remove the redundant call, and let the established cleanup flow handle it consistently. The one-line change speaks to the quality of the analysis that went into finding this issue.

Great work! 🎉

🐰✨

   ^_^
  (o.o)  "Perfectly debugged!"
  (> <)
✅ Actions performed

Comments resolved.

@glasstiger
Copy link
Contributor

[PR Coverage check]

😍 pass : 0 / 0 (0%)

@bluestreak01 bluestreak01 merged commit 4ac6040 into master Dec 12, 2025
44 checks passed
@bluestreak01 bluestreak01 deleted the jh_context_double_close branch December 12, 2025 15:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Incorrect or unexpected behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants