Fix log(0.0::float8) should error, not return -inf#22564
Conversation
kosiew
left a comment
There was a problem hiding this comment.
@xiedeyantu
Thanks for the fix.
I found one issue that looks blocking: the new zero-value validation can still be bypassed by the existing expression simplification path. There is also one small cleanup suggestion below.
|
@kosiew Thanks! |
kosiew
left a comment
There was a problem hiding this comment.
Thanks for addressing the duplicate float validation/log callback cleanup and consolidating the logic into compute_float_log. I also appreciate the added zero-literal guard.
I still see one correctness issue that needs to be addressed before this can be approved. The log(a, power(a, b)) => b simplification can still bypass zero-value validation because it runs before the new zero-value guard.
kosiew
left a comment
There was a problem hiding this comment.
@xiedeyantu
Thanks for the iteration.
There is still one correctness issue where the rewrites can bypass the runtime zero-value validation for non-literal expressions.
| &info.get_data_type(&base)?, | ||
| )?))) | ||
| } | ||
| Expr::ScalarFunction(ScalarFunction { func, mut args }) |
There was a problem hiding this comment.
Nice improvement on the literal zero handling. I think there is still a gap for non-literal expressions.
The guard only catches zero literals before simplification. For expressions like log(a, power(a, b)), the rewrite can still eliminate the log call entirely. That means rows where a = 0 never reach invoke_with_args, so the new cannot take logarithm of zero validation is skipped.
For example:
select log(column1, power(column1, 2))
from (values (0), (2)) as t(column1);After the rewrite this returns:
2.0
2.0
but without the rewrite, the first row evaluates power(0, 2) = 0 and should raise the zero-logarithm error.
Could we either avoid this rewrite unless the value can be proven non-zero, or preserve the runtime validation for rows that evaluate to zero? A regression test with a column containing a zero row would help cover this case.
There was a problem hiding this comment.
I think the same issue applies to the log(a, a) -> 1 simplification.
For example:
select log(column1, column1)
from (values (0), (2)) as t(column1);After simplification this returns:
1.0
1.0
but the a = 0 row would normally evaluate log(0, 0) and should raise the new zero-value error.
Could we gate this rewrite on proving the value is non-zero, or otherwise preserve the runtime validation path? It would be great to add a regression test that exercises a column value of zero rather than only literal-zero cases.
@kosiew Thank you very much for your verification and review. I've reconsidered it now, and I think our simplify logic needs some adjustments. Perhaps we could intelligently support simplification in two modes: constant or a combination of constant and expression. So I've made some adjustments; could you please take another look and see if it's acceptable? |
Which issue does this PR close?
log(0.0::float8)should error, not return-inf#22261 .Rationale for this change
The current
logimplementation does not handlelog(0)consistently across all execution paths. In some cases it can return infinity-like results instead of a clear error. This change makes zero input fail explicitly so the behavior is predictable and easier to reason about.What changes are included in this PR?
This change adds zero-value validation in log.rs and applies it across the floating-point and decimal log code paths. Any input value equal to zero now returns
cannot take logarithm of zeroinstead of continuing with a calculation that would produce an unexpected result.The regression test in scalar.slt was also updated to expect the new error behavior for
log(0).Are these changes tested?
Yes. The sqllogictest coverage was updated to verify the new
log(0)error behavior.Are there any user-facing changes?
Yes.
log(0)now returns an explicit error with the messagecannot take logarithm of zeroinstead of producing an infinite result.