Skip to content

Fix log(0.0::float8) should error, not return -inf#22564

Open
xiedeyantu wants to merge 1 commit into
apache:mainfrom
xiedeyantu:log
Open

Fix log(0.0::float8) should error, not return -inf#22564
xiedeyantu wants to merge 1 commit into
apache:mainfrom
xiedeyantu:log

Conversation

@xiedeyantu
Copy link
Copy Markdown
Member

Which issue does this PR close?

Rationale for this change

The current log implementation does not handle log(0) consistently across all execution paths. In some cases it can return infinity-like results instead of a clear error. This change makes zero input fail explicitly so the behavior is predictable and easier to reason about.

What changes are included in this PR?

This change adds zero-value validation in log.rs and applies it across the floating-point and decimal log code paths. Any input value equal to zero now returns cannot take logarithm of zero instead of continuing with a calculation that would produce an unexpected result.

The regression test in scalar.slt was also updated to expect the new error behavior for log(0).

Are these changes tested?

Yes. The sqllogictest coverage was updated to verify the new log(0) error behavior.

Are there any user-facing changes?

Yes. log(0) now returns an explicit error with the message cannot take logarithm of zero instead of producing an infinite result.

@github-actions github-actions Bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels May 27, 2026
Copy link
Copy Markdown
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiedeyantu
Thanks for the fix.
I found one issue that looks blocking: the new zero-value validation can still be bypassed by the existing expression simplification path. There is also one small cleanup suggestion below.

Comment thread datafusion/functions/src/math/log.rs
Comment thread datafusion/functions/src/math/log.rs Outdated
@xiedeyantu
Copy link
Copy Markdown
Member Author

@kosiew Thanks!

Copy link
Copy Markdown
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiedeyantu

Thanks for addressing the duplicate float validation/log callback cleanup and consolidating the logic into compute_float_log. I also appreciate the added zero-literal guard.

I still see one correctness issue that needs to be addressed before this can be approved. The log(a, power(a, b)) => b simplification can still bypass zero-value validation because it runs before the new zero-value guard.

Comment thread datafusion/functions/src/math/log.rs Outdated
Copy link
Copy Markdown
Contributor

@kosiew kosiew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiedeyantu
Thanks for the iteration.
There is still one correctness issue where the rewrites can bypass the runtime zero-value validation for non-literal expressions.

&info.get_data_type(&base)?,
)?)))
}
Expr::ScalarFunction(ScalarFunction { func, mut args })
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement on the literal zero handling. I think there is still a gap for non-literal expressions.

The guard only catches zero literals before simplification. For expressions like log(a, power(a, b)), the rewrite can still eliminate the log call entirely. That means rows where a = 0 never reach invoke_with_args, so the new cannot take logarithm of zero validation is skipped.

For example:

select log(column1, power(column1, 2))
from (values (0), (2)) as t(column1);

After the rewrite this returns:

2.0
2.0

but without the rewrite, the first row evaluates power(0, 2) = 0 and should raise the zero-logarithm error.

Could we either avoid this rewrite unless the value can be proven non-zero, or preserve the runtime validation for rows that evaluate to zero? A regression test with a column containing a zero row would help cover this case.

Comment thread datafusion/functions/src/math/log.rs Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the same issue applies to the log(a, a) -> 1 simplification.

For example:

select log(column1, column1)
from (values (0), (2)) as t(column1);

After simplification this returns:

1.0
1.0

but the a = 0 row would normally evaluate log(0, 0) and should raise the new zero-value error.

Could we gate this rewrite on proving the value is non-zero, or otherwise preserve the runtime validation path? It would be great to add a regression test that exercises a column value of zero rather than only literal-zero cases.

@github-actions github-actions Bot added the core Core DataFusion crate label Jun 1, 2026
@xiedeyantu
Copy link
Copy Markdown
Member Author

@xiedeyantu Thanks for the iteration. There is still one correctness issue where the rewrites can bypass the runtime zero-value validation for non-literal expressions.

@kosiew Thank you very much for your verification and review. I've reconsidered it now, and I think our simplify logic needs some adjustments. Perhaps we could intelligently support simplification in two modes: constant or a combination of constant and expression. So I've made some adjustments; could you please take another look and see if it's acceptable?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

PostgreSQL compatibility: log(0.0::float8) should error, not return -inf

2 participants