Overriding profile endpoint with analyze endpoint with operator tree and profiling #5568
Krish-Gandhi wants to merge 9 commits into
…and profiling Signed-off-by: Krish Gandhi <kjg2352@gmail.com>
Let's update the |
Let's also document the current limitation for |
…' on complex queries Signed-off-by: Krish Gandhi <kjg2352@gmail.com>
…, which is not a part of this PR Signed-off-by: Krish Gandhi <kjg2352@gmail.com>
Updated |
Description
- `analyze` endpoint for PPL queries, which can be activated by passing `"analyze": true` as a request body parameter.
- `profile` endpoint to run `analyze` functionality.
- `analyze` flag through transport and request parsing.
- `operator_tree`.
- `operator_tree` nodes.
- `profile` `plan` response with `operator_tree` nodes to calculate time taken by each node.
Note: This PR provides a simple implementation for building the `operator_tree`, which works for simpler queries. The logic doesn't hold for queries that produce non-linear physical plan trees (for example, `JOIN`s). This only affects the `operator_tree`; the subset of the response that corresponds to `profile` remains fully functional. In this scenario, the `analyze` endpoint will therefore fall back to the `profile` output by returning a response that mirrors the `profile` endpoint (including only the `profile`, `schema`, `datarows`, `total`, and `size` fields). This `operator_tree` issue will be addressed in a later PR.
Important
This PR overrides the existing `profile` endpoint by routing all requests with either `"analyze": true` or `"profile": true` (or both) to `pplService.analyze()` in `TransportPPLQueryAction.java`. For current end users of the `profile` endpoint, this will make little difference, because the response from `profile` is a subset of the response from `analyze`. In other words, current end users need to make no changes and will get similar results.
All of the existing code for `profile` is still in place, which makes it simple to separate `analyze` and `profile` again: remove the `|| transformedRequest.profile()` part of the boolean expression on line 200 of `TransportPPLQueryAction.java`.
`plugin/src/main/java/org/opensearch/sql/plugin/transport/TransportPPLQueryAction.java:193-209`:
Example Query and Response
The following `curl` command will run the query `source=accounts | where age < 30 | eval full_name = firstname + \" \" + lastname | fields full_name, email, age` on the `analyze` endpoint:
The response will be as follows. (NOTE: The `"logicalPlan"` and `"physicalPlan"` fields are included for debugging purposes and should not be included in the final version of this endpoint.)
Performance of `analyze`
After writing a benchmarking script to run 20 queries on a sample 10 GB dataset, the results were as follows. This shows that `analyze` has a small enough overhead to justify running it by default on the Discover tab in OpenSearch Dashboards.
Related Issues
#5500
Resolves #4343
Check List
`--signoff` or `-s`.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.