
Constant number of db queries in contributor index #2715

Draft
Kakadus wants to merge 5 commits into e-valuation:main from Kakadus:constant-contributor-index

Conversation

Collaborator

@Kakadus Kakadus commented Apr 29, 2026

concerns #1229

  1. Prefetch patch, discussed in PR "prefetch participant and voter counts in contributor index" #2714. (200x10 evaluation: 50s to 44s)
  2. Annotate evaluations in calculate_average_course_distribution.
  • I think this may even cause a significant slowdown, but it does reduce the number of db queries. The problem is that course.evaluations.all() does not have the voter and participant counts which are fetched in calculate_average_distribution.
  • The patch allows callers of calculate_average_distribution to prefetch & annotate all relevant evaluations beforehand (+ some sorting magic to avoid grouping the evaluations by course). I'm sure something smarter is possible. Maybe you have some thoughts about this already?
  3. Very crude implementation of the Evaluation.can_be_seen_by(x) functions as Django Q objects to allow annotating those attributes on the evaluation objects. This reduced the 200x10 test time significantly to just 13s (-70%). However, the Q functions aren't readable at all.

2 & 3 are very hacky and were just results of me experimenting around in order to start a discussion.

Comment thread evap/evaluation/models.py
Comment on lines +743 to +759
@staticmethod
def can_be_seen_by_Q(user):
    if user.is_manager:
        return Q()

    if user.is_reviewer:
        return ~Q(state=Evaluation.State.NEW) & ~Q(course__semester__results_are_archived=True)

    base_q = ~Q(state=Evaluation.State.NEW)
    if user.is_external:
        # parenthesized so that base_q applies to both alternatives
        # (& binds tighter than | in Python)
        return base_q & (
            Evaluation.user_is_responsible_or_contributor_or_delegate_Q(user)
            | Q(participants=user)
        )

    return base_q & (
        ~Q(course__is_private=True)
        | Evaluation.user_is_responsible_or_contributor_or_delegate_Q(user)
        | Q(participants=user)
    )
Member


There is no way to evaluate a Q instance on a Python model instance, right? If we can in any way avoid re-implementing this logic, I would prefer that. Hunting down subtle semantic-difference bugs between the SQL world and the Python world is on my not-to-do list for this year.

One approach that should always work: fetch a list of instances (e.g. evaluations visible to the user) without any prefetching (this should take a few milliseconds), then generate a list of IDs in Python and pass that to follow-up queries (which now do the "heavy" prefetching).

Member


What is the struggle with the 200x10 case? We would fetch 2000 evaluation instances, and we'd have to prefetch 200 courses, ~20 course types, ~10 programs, ~400 evaluations (the instances are already in memory), 1 semester, and ~200 responsibles. All of these are toy numbers. I suspect the total database query runtime for the main query and all the prefetch queries is in the single- to low-two-digit milliseconds.

None of this justifies almost a minute of runtime. Touching on the discussion in #2235: with the full test_data.json, we render all result rows for all evaluations (thousands) in a few seconds, and there is nothing fancy in the code for that. How can fetching and rendering a few hundred evaluations possibly be slower than that?

