Skip to content

Query Evaluator

nitro edited this page Apr 13, 2019 · 3 revisions

Query Evaluator

The Query Evaluator takes the Query object built by the Query builder and queries the PKB for answers. It first determines whether the query is "simple", where a simple query is one without clauses. Simple queries are evaluated directly by querying the PKB for what is been selected. A query that is not "simple", i.e. with clauses, will have each of its clauses evaluated separately and results stored (either a table or boolean). These results will then be merged depending on if its a table or boolean. Finally, the column with the select query will be selected and returned to the frontend.

Data representation for program queries

Query Evaluator gets a Query object from Query builder, which contains all the information of program queries. The data representation of program queries is the same as the description of Query object in Query Builder part.

Strategy for basic query evaluation(BQE)

As shown in the activity diagram of PQL Evaluator, at the beginning of the query evaluation, PQL Evaluator first checks whether contains any clauses. The Query that doesn't contain clause is a simple Query, otherwise, it's a complex query. The evaluating strategies for simple Queries and complex Queries are different.

As for a simple Query, if Query selects Boolean, PQL Evaluator returns true, otherwise, PQL Evaluator simply queries PKB for the data of the synonym or attrRef selected in the tuple and merge them.

As for the complex Query, PQL Evaluator evaluates each clause in Query by querying the PKB for data by calling the Getter API in PKB and filter the data according to the parameters in the clauses. The table merging method is reused here to filter the data. After that, we should get the correct result for each clause, then these results will be merged as the final result table which contains all the results that satisfy all the clauses. Then according to the tuple selected, we drop the columns in result table that are not selected and keep merging result table with tables of synonym and attrRef that are not included result table. At this point, PQL Evaluator gets sufficient information that can be returned. The last step is to format the result according to the requirements of SPA into a list of string and return the result.

During the evaluation, if we encounter any clause that is evaluated to be false, which means no data in PKB can satisfy the clause, the evaluation process will be terminated and return empty result or FALSE if Boolean is selected.

Example of filtering data during clause evaluation
Follows*(s,"5")

At first, PQL Evaluator calls getFollowsT() to queries PKB for the whole table of relation Follows* and the table for statement s. Then we construct a new table containing only "5". After merging these three tables, the result tables will contain results satisfying the clause Follows*(s,"5"). Because the tuple selected only contains Synonym or attrRef, therefore column "5" will not be selected and will be dropped, only column "s" will be returned for merging.

The same method is used for Parent, Parent*, Uses, Modifies, Follows,Follows*, Calls,Calls*,Next, AssignPatt, WhilePatt and IfPatt (except with and Next*). With clause and Next* clause are evaluated differently with other clauses, because With contains attrRef that other clauses don't contain and Next* is not stored in PKB so we can't apply the same method as other clauses.

Design decisions

  1. The above descried the way to get results for each clause differed from the way in iteration 1. Iteration 1 has one method for each type of clause to get result. In iteration 2, we combine many of them into one method.

Iteration 1

TABLE getDataFromFollowS(CLAUSE c);
TABLE getDataFromFollowST(CLAUSE c);
TABLE getDataFromParent(CLAUSE c);
TABLE getDataFromParentT(CLAUSE c);
......

Iteration 2:

TABLE dataFilter(RELATIONTABLE T, PARAMETERS S)
Note: With and Next* clauses can't use this method, need to have one method for their own. 
......
Considerations Iteration 1 Iteration 2 Importance of Consideration
Code Quality Creating methods for each type of clause repeat lots of code for querying the PKB for data and filtering data according to parameters in clauses. The code quality is low. Using one method can evaluate most of the clauses. The part of codes that works for querying the PKB and filtering the data is reused many times. The code quality of second approach is better. Important to keep a better code quality, since it keeps codes concise and easy to understand.
Correctness The previous version of PKB doesn't provide the whole relation table. Pql Evaluator needs to provide assign one value to one of the parameters in one clause to get values of the other parameters. In this case, only the first approach works. Currently, PKB provides whole relation table, therefore, the second approach produces correct out as the first approach.The explanation of how the second approach works is in Strategy for basic query evaluation. Therefore, in terms of correctness, they should be the same. Most important to make sure this component works correctly. Second approach works correctly as well after the improvement in PKB .

After PKB supports providing the whole relation table, it is obvious that the second way outperforms the first way in code quality. By applying the second approach, the lines of code reduce a lot but the functionality is still complete.

  1. In iteration 2, selection of attrRef is required. There are two different solutions available for this new requirement as shown below:
procedure p; call c;
Select <p,c.procName> ... clauses.

Result table would be the result of evaluating all the clauses. It contains c in this case.

Result Table:

procedure p call c
main 10
one 9
two 9
two 10

c.procName table:

c c.procName
4 one
9 two
10 three
Possible solutions:

Solution 1: If the element is not in the result table, query PKB for its data table and merge it with the result table. If the element is in the table, nothing will be done on the result table. Finally, extract all the data needed in the end.

Solution 2: Extract the values of the elements that are in tuple and result table, then do a cross product with those that are in tuple but not in the result table.

Considerations Solution 1 Solution 2 Importance of Consideration
Performance Solution 1 do the merge with attrRef that is not in the result table. Result table could be large, so the merging process could take a long time. Solution 2 reduces the size of result table by extracting the columns that can both be found in tuple and result table first, then do the merge. The performance should be better. Performance is an important consideration for PQL Evaluator since it's in SPA requirement.
Correctness Solution 1 should be correct. It is the brute force way to get the result. Could be wrong. In the example, c is in result table but not tuple. C is dropped. When we merge c.procName with procedure p, result would contain (main, one), (main two) that are not supposed to show. The most important consideration.

Decision: Although solution 2 has better performance, it may make mistakes in some cases. Because correctness is the first thing we need to guarantee, we need to choose solution 1.

Clone this wiki locally