Return DSA area for hash table from GetNamedDSHash()

Started by Sami Imseih, 4 days ago, 4 messages, hackers
#1 Sami Imseih
samimseih@gmail.com

Hi,

While working on extending tests for dshash.c [1], I realized that a
user who creates a hash table with GetNamedDSHash() has no way
to cap the size of the DSA area underpinning the table by using
dsa_set_size_limit(). This is because the dsa_area created by
this API is not exposed to the user.

This is a gap for users of the GetNamedDSHash() API,
because it's very likely that the callers don't want runaway growth of
these hash tables.

Attached is a new API, dshash_get_dsa_area() that takes in a dshash_table
and returns the area. The caller can then use dsa_set_size_limit() to limit
the size.

We could change the GetNamedDSHash() API to take in a size, but that
would not be ideal, since a caller may want to change the size dynamically
after the hash table is created.
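
A minimal sketch of that flow, using toy stand-ins rather than the real
PostgreSQL structs. Every toy_* name below is hypothetical and exists only
for illustration; real DSA accounting is more involved than a single byte
counter. It shows the accessor returning the area behind a table, a cap being
set on that area, and the cap being raised later without recreating the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the PostgreSQL structs: just enough state for the flow. */
typedef struct toy_dsa_area
{
	size_t		total_size;		/* bytes handed out so far */
	size_t		max_total_size; /* cap on total_size */
} toy_dsa_area;

typedef struct toy_dshash_table
{
	toy_dsa_area *area;			/* area backing the table */
	size_t		nentries;		/* entries inserted so far */
} toy_dshash_table;

/* Models the proposed accessor: hand back the area behind the table. */
static toy_dsa_area *
toy_dshash_get_dsa_area(toy_dshash_table *hash_table)
{
	assert(hash_table != NULL);
	return hash_table->area;
}

/* Models dsa_set_size_limit(): record a cap that later allocations obey. */
static void
toy_dsa_set_size_limit(toy_dsa_area *area, size_t limit)
{
	area->max_total_size = limit;
}

/* Models an insert needing entry_size bytes; fails once the cap is reached. */
static bool
toy_dshash_insert(toy_dshash_table *hash_table, size_t entry_size)
{
	toy_dsa_area *area = hash_table->area;

	if (area->total_size > area->max_total_size ||
		entry_size > area->max_total_size - area->total_size)
		return false;

	area->total_size += entry_size;
	hash_table->nentries++;
	return true;
}
```

Because the caller holds the area, the limit can be adjusted at any point in
the table's lifetime, which a one-time size argument at creation could not
offer.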

I don't have a patch for this yet, but I think it would also make sense for
pg_dsm_registry_allocations to show the max_size.

postgres=# select * from pg_dsm_registry_allocations;
          name          |  type   |  size
------------------------+---------+---------
 test_dsm_registry_dsa  | area    | 1048576
 test_dsm_registry_hash | hash    | 1048576
 test_dsm_registry_dsm  | segment |      20
(3 rows)

Thoughts?

[1] /messages/by-id/acXCJODjsCytdpwT@paquier.xyz

--
Sami Imseih
Amazon Web Services (AWS)

Attachments:

v1-0001-Add-function-to-return-DSA-area-for-a-dshash-tabl.patch (application/octet-stream, +12 -1)
#2 jie wang
jugierwang@gmail.com
In reply to: Sami Imseih (#1)
Re: Return DSA area for hash table from GetNamedDSHash()

Hi,

I think an assert check could be added in this patch for better safety.
Assert(hash_table != NULL);

Best regards,
--
wang jie

#3 Michael Paquier
michael@paquier.xyz
In reply to: Sami Imseih (#1)
Re: Return DSA area for hash table from GetNamedDSHash()

On Mon, Apr 06, 2026 at 05:56:21PM -0500, Sami Imseih wrote:

> Attached is a new API, dshash_get_dsa_area() that takes in a dshash_table
> and returns the area. The caller can then use dsa_set_size_limit() to limit
> the size.
>
> +dsa_area *
> +dshash_get_dsa_area(dshash_table *hash_table)
> +{
> +	Assert(hash_table->control->magic == DSHASH_MAGIC);
> +
> +	return hash_table->area;

Rather than an API that returns the DSA area, perhaps it would be more
natural to have a wrapper that calls dsa_set_size_limit(), using an
existing dshash_table in input?
--
Michael
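
A minimal sketch of the wrapper shape suggested above, again with toy
stand-ins rather than the real PostgreSQL structs. The name
toy_dshash_set_size_limit() and every other toy_* identifier are hypothetical;
no such function exists in the posted patch. The wrapper forwards the limit to
the area behind the table, so the caller never touches the area itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the PostgreSQL structs. */
typedef struct toy_dsa_area
{
	size_t		max_total_size; /* cap on the area's total size */
} toy_dsa_area;

typedef struct toy_dshash_table
{
	toy_dsa_area *area;			/* area backing the table */
} toy_dshash_table;

/* Models dsa_set_size_limit(): record the cap on the area. */
static void
toy_dsa_set_size_limit(toy_dsa_area *area, size_t limit)
{
	area->max_total_size = limit;
}

/* Hypothetical wrapper: sets the cap without exposing the area to callers. */
static void
toy_dshash_set_size_limit(toy_dshash_table *hash_table, size_t limit)
{
	assert(hash_table != NULL);
	toy_dsa_set_size_limit(hash_table->area, limit);
}
```

One trade-off of this shape, as far as the sketch goes: the area stays private
to the hash table, but any further area-level operation a caller needs would
require its own wrapper.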

#4 Sami Imseih
samimseih@gmail.com
In reply to: Michael Paquier (#3)
Re: Return DSA area for hash table from GetNamedDSHash()

Thanks for the replies!

> I think an assert check could be added in this patch for better safety.
> Assert(hash_table != NULL);

I followed the same approach we take for dshash_destroy() and
dshash_get_hash_table_handle(). The caller is responsible for
not passing in a NULL hash table; otherwise the existing Assert will
segfault when it dereferences the pointer.

> +dsa_area *
> +dshash_get_dsa_area(dshash_table *hash_table)
> +{
> +       Assert(hash_table->control->magic == DSHASH_MAGIC);
> +
> +       return hash_table->area;
>
> Rather than an API that returns the DSA area, perhaps it would be more
> natural to have a wrapper that calls dsa_set_size_limit(), using an
> existing dshash_table in input?

Hm, having GetNamedDSA() return a dsa_area for direct use while requiring
a special wrapper for the dshash case creates an inconsistent API in
dsm_registry.h. With dshash_get_dsa_area(), however the dsa_area is
obtained, dsa_set_size_limit() can be used to set the limit.

--
Sami