-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore: support addition between a timestamp and a timedelta #1369
Conversation
bigframes/core/compile/ibis_types.py
Outdated
@@ -402,7 +404,12 @@ def literal_to_ibis_scalar( | |||
return bigframes_vendored.ibis.null() | |||
|
|||
scalar_expr = bigframes_vendored.ibis.literal(literal) | |||
if ibis_dtype: | |||
if isinstance(literal, datetime.timedelta): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we allow pandas, numpy, and pyarrow scalars here, too?
if isinstance(literal, datetime.timedelta): | |
if isinstance(literal, (datetime.timedelta, numpy.timedelta64, pandas.Timedelta, pyarrow.DurationScalar)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure! Pandas timedelta is a subclass of datetime.timedelta, but I can make it more explicit.
Added support for numpy.
I eventually decided not to implement pyarrow duration because:
- It causes more unrelated failures than I expected in Python 3.9 and 3.10 envs for some reasons.
- Pandas does not support adding pyarrow duration literals to series
Maybe we can add support for pyarrow in the future, but at this moment it may not be worth it
bigframes/core/utils.py
Outdated
def timedelta_to_micros(td: typing.Union[pd.Timedelta, datetime.timedelta]) -> int: | ||
if isinstance(td, pd.Timedelta): | ||
# td.value returns total nanoseconds. | ||
return td.value // 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like pandas.Timedelta
has units: https://pandas.pydata.org/docs/reference/api/pandas.Timedelta.html
Please make this robust to various units.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that is for the constructor arg, but as per the docs, "The .value attribute is always in ns."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Pandas always converts the input to values in nanoseconds, no matter what unit we use in the constructor.
bigframes/dtypes.py
Outdated
@@ -578,6 +580,8 @@ def _infer_dtype_from_python_type(type: type) -> Dtype: | |||
return DATE_DTYPE | |||
if issubclass(type, datetime.time): | |||
return TIME_DTYPE | |||
if issubclass(type, datetime.timedelta): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, do we need numpy, pandas, and pyarrow object detection here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Left the PyArrow out for this one.
146c690
to
c680948
Compare
c9a5021
to
b5b69cc
Compare
bigframes/core/compile/ibis_types.py
Outdated
if isinstance( | ||
literal, | ||
(datetime.timedelta, pd.Timedelta, numpy.timedelta64), | ||
): | ||
# numpy timedelta is compatible with Ibis, so we process them separately. | ||
return bigframes_vendored.ibis.literal( | ||
utils.timedelta_to_micros(literal), ibis_dtype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we just replace literal expressions (as well as column defs in leaves) in the tree when we replace all the ops?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. I moved the conversion to the rewrite module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to rename the module rewrite.operators
to something like rewrite.timedelta_expression
to better reflect its behavior, perhaps in another PR
@scalar_op_compiler.register_binary_op(ops.timestamp_add_op) | ||
def timestamp_add_op_impl(x: ibis_types.TimestampValue, y: ibis_types.IntegerValue): | ||
return x + y.to_interval("us") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this syntax look like from this? Is it a clean TIMESTAMP_ADD(timestamp_col, INTERVAL duration MICROSECOND)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions#timestamp_add
Though this has less to do with SQL syntax than the ease of wiring code. Ibis does support interval + timestamp and timestamp + interval, but that means I also need to check which side of the input is integer here.
If I can standardize the order of types in the rewrite module, then I don't need to check ibis type here :)
No description provided.