Fix for failing tests due to dtype issue. #2407
Added checks for invalid inputs
Added tests to check invalid inputs
Fix for model not loading when using numpy behaviour with tensorflow
This reverts commit 3fdc7fd.
Updated the convert_tokenizer function to support both string and list formats for merges in the tokenizer config, improving compatibility with different tokenizer export formats.
Removed debug print statements and redundant merge format handling logic. Now only converts list-of-lists merge format to space-separated strings, streamlining the tokenizer conversion process.
Changed the dtype of position_ids from int to 'int64' in the CLIPVisionEmbedding layer to ensure correct device placement, particularly for TensorFlow backends.
Added explicit casting of rel_pos and abs_pos to int32 in the _make_log_bucket_position method to ensure correct data types for subsequent operations.
Corrects dtype casting in _make_log_bucket_position and _get_rel_pos methods to ensure consistency and prevent potential type errors. Uses the input dtype for bucket_pos and explicitly casts ids to int32.
Summary of Changes
Hello @pctablet505, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses critical data type compatibility issues that were causing tests to fail, particularly with the TensorFlow backend. By standardizing integer data types to `int32` for position ID tensors and related operations across various layers, the changes ensure consistent behavior and prevent future failures across different deep learning backends.
Highlights
- **Standardized Integer Dtypes:** The data type for `position_ids` weights in `clip_layers.py` and `siglip_layers.py` has been explicitly set to `"int32"` to resolve TensorFlow backend failures and ensure cross-backend compatibility.
- **Consistent Relative Position Casting:** The `rel_pos` tensor in the `_make_log_bucket_position` method within `disentangled_self_attention.py` is now explicitly cast to `"int32"` to maintain consistent data type handling.
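The mismatch behind these highlights can be sketched outside Keras. In the snippet below, NumPy stands in for a backend: Python's `int` resolves to a platform-default integer width, while the string `"int32"` pins the width explicitly.

```python
import numpy as np

# Python `int` maps to the platform's default integer width
# (int64 on most 64-bit Linux and macOS builds), so two backends,
# or two platforms, can disagree on what `dtype=int` means.
a = np.arange(5, dtype=int)

# "int32" pins the width explicitly, so every backend agrees.
b = np.arange(5, dtype="int32")

print(a.dtype.kind, b.dtype)  # i int32
```

This is the core of the fix: replacing the platform-dependent `int` with an explicit `"int32"` string wherever position IDs are created.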
Code Review

This pull request correctly addresses the `dtype` compatibility issues that were causing test failures on the TensorFlow backend. By explicitly setting integer dtypes to `"int32"`, you've ensured consistent behavior across TensorFlow, JAX, and PyTorch, which aligns with the project's goal of being backend-agnostic.

The changes in `clip_layers.py`, `siglip_layers.py`, and `deberta_v3/disentangled_self_attention.py` are all sound. I've left a few suggestions to improve the comments in `clip_layers.py` and `siglip_layers.py` for better long-term maintainability, as the current versions are either a bit informal or potentially confusing.

Overall, this is a good fix that improves the robustness of the models across different backends.
```diff
  # Let the backend determine the int32 dtype. For example, tf
  # requires int64 for correct device placement, whereas jax and torch
  # don't.
- dtype=int,
+ dtype="int32",
```
Similar to the change above, this comment is confusing. It says "Let the backend determine the int32 dtype", but the code is now explicitly setting `dtype="int32"` to ensure consistency across backends. Updating the comment to reflect this would make the code easier to understand and align with the project's style of writing backend-agnostic code.
Suggested change:

```diff
- # Let the backend determine the int32 dtype. For example, tf
- # requires int64 for correct device placement, whereas jax and torch
- # don't.
- dtype=int,
- dtype="int32",
+ # Explicitly use "int32" for backend compatibility. Using `int` can
+ # lead to dtype mismatches between backends (e.g., int64 on
+ # TensorFlow), while "int32" is consistent across all backends.
+ dtype="int32",
```
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
The dtype issue is coming up due to a bug that was recently introduced in Keras. We should fix it at the core library level instead of modifying the keras-hub models.
I've reported the issue for tracking here: keras-team/keras#21677
```diff
  sign = ops.sign(rel_pos)
- mid = self.bucket_size // 2
- mid = ops.cast(mid, dtype=dtype)
+ mid = ops.cast(self.bucket_size // 2, dtype=dtype)
```
This change doesn't seem to do anything, we can omit it.
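To see why the reviewer calls this change a no-op, here is a runnable sketch in which a small helper stands in for `keras.ops.cast` (the helper is hypothetical, added only so the sketch runs without Keras; the two call patterns come from the diff):

```python
import numpy as np

def cast(x, dtype):
    # Hypothetical stand-in for `keras.ops.cast`.
    return np.asarray(x).astype(dtype)

bucket_size = 7
dtype = "float32"

# Original two-step form:
mid = bucket_size // 2
mid_two_step = cast(mid, dtype=dtype)

# Condensed one-line form from the diff:
mid_one_line = cast(bucket_size // 2, dtype=dtype)

# Same value, same dtype: the diff changes style, not behavior.
assert mid_two_step == mid_one_line
```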
```diff
+ # Explicitly use "int32" for backend compatibility. Using `int` can
+ # lead to dtype mismatches between backends (e.g., int64 on
+ # TensorFlow), while "int32" is consistent across all backends.
+ dtype="int32",
```
We can't modify this to be `int32` directly. The original, removed comment explains why. In TensorFlow, `"int32"` tensors are typically placed on the CPU, which can cause issues with graph generation, as parts of the model would reside on different devices. `int` usually maps to `"int64"` and stays on the GPU.
```diff
+ # Explicitly use "int32" for backend compatibility. Using `int` can
+ # lead to dtype mismatches between backends (e.g., int64 on
+ # TensorFlow), while "int32" is consistent across all backends.
+ dtype="int32",
```
Same as the other comment.
```diff
  def _get_rel_pos(self, num_positions):
      ids = ops.arange(num_positions)
-     ids = ops.cast(ids, dtype="int")
+     ids = ops.cast(ids, dtype="int32")
```
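The pattern in this hunk can be exercised with NumPy standing in for `keras.ops` (the function below is a sketch of the diffed code's shape, not the actual keras-hub implementation):

```python
import numpy as np

def get_rel_pos(num_positions):
    # `arange` starts from a backend- or platform-default integer
    # dtype; the explicit cast pins the ids to "int32", mirroring
    # the diff above.
    ids = np.arange(num_positions)
    ids = ids.astype("int32")
    return ids

rel = get_rel_pos(4)
print(rel, rel.dtype)  # [0 1 2 3] int32
```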
Same as the other comment.
This issue was fixed in keras-team/keras#21679. |
Fix for failing TAP presubmit tests:
Fix for failing tests on tensorflow backend:
This pull request addresses data type compatibility issues across different backends by standardizing the data type for position ID tensors and related operations to `int32`. This change ensures consistent behavior and prevents failures, particularly with the TensorFlow backend.

Backend compatibility improvements:

- Updated the `dtype` argument for `position_ids` weights in both `clip_layers.py` and `siglip_layers.py` to use `"int32"` instead of `int`, resolving failures with TensorFlow and ensuring compatibility across all backends. [1] [2] [3]
- Cast `rel_pos` to `"int32"` in the `_make_log_bucket_position` method of `disentangled_self_attention.py` to maintain consistent data types.

This pull request addresses integer dtype compatibility issues across different backends (TensorFlow, JAX, Torch) by explicitly setting the dtype to `"int32"` when creating position ID weights and casting relative positions. This ensures consistent behavior and prevents failures, particularly with TensorFlow.
Cross-backend dtype compatibility improvements:

- Changed the `dtype` argument from `int` to `"int32"` in the `add_weight` calls for `position_ids` in both `clip_layers.py` and `siglip_layers.py` to ensure compatibility with all supported backends, especially TensorFlow. [1] [2] [3]
- Explicitly cast to `"int32"` for `rel_pos` in the `_make_log_bucket_position` method in `disentangled_self_attention.py` to standardize dtype handling.
to standardize dtype handling.## Description of the change