git-reader - improve collection changeset performance by caching the sorted refs #1388

Open
alexcottner wants to merge 1 commit into main from rmst-438

@alexcottner

Contributor

RMST-438

While investigating CPU spikes, I found that the raw_listall_references call was responsible for 75% of the work done in get_collection_changeset.

Caching the results from this gives us a pretty good performance boost in some synthetic testing.

The tests ran JMeter with 100 threads for 100 loops.

Before:
Pull ms-images changeset: 3:45
Pull an invalid collection's changeset: 2:30

After:
Pull ms-images changeset: 1:40
Pull an invalid collection's changeset: 0:18
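
For readers who want the idea in isolation, here is a minimal sketch of memoizing an expensive "list and sort the refs" step with `functools.lru_cache`. It is not the code from this PR: `FakeRepo`, `list_refs`, and `sorted_refs` are hypothetical stand-ins so the example runs without pygit2, and the real change may be structured differently.

```python
from functools import lru_cache


class FakeRepo:
    """Hypothetical stand-in for a repository; counts how often refs are listed."""

    def __init__(self, refs):
        self._refs = list(refs)
        self.list_calls = 0

    def list_refs(self):
        # Simulates the expensive listing call.
        self.list_calls += 1
        return list(self._refs)


@lru_cache(maxsize=1)
def sorted_refs(repo):
    # List and sort once per repo object; return a tuple so callers
    # cannot mutate the cached value.
    return tuple(sorted(repo.list_refs()))


repo = FakeRepo(["refs/tags/v2", "refs/heads/main", "refs/tags/v1"])
first = sorted_refs(repo)
second = sorted_refs(repo)  # served from the cache, no second listing
```

After the two calls, `repo.list_calls` is 1 and both calls return the same tuple object. Note that this sketch never invalidates the cache, so it only stays correct while the refs do not change.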

@leplatrem leplatrem left a comment

Contributor

Good catch! I have doubts about the caching logic, though.

Comment thread git-reader/app.py
return decorator


@lru_cache(maxsize=500)

Where does this 500 come from? (number of possible collections + margin? maybe better as a config)
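
One possible shape for the config suggestion, as a sketch only: read the cache size from an environment variable and fall back to the 500 used in the diff. The variable name `GIT_READER_REFS_CACHE_SIZE` and the function `cached_lookup` are made up for illustration and are not taken from the project.

```python
import os
from functools import lru_cache

# Hypothetical setting name; falls back to the value currently hard-coded in the diff.
REFS_CACHE_SIZE = int(os.environ.get("GIT_READER_REFS_CACHE_SIZE", "500"))


@lru_cache(maxsize=REFS_CACHE_SIZE)
def cached_lookup(bid: str, cid: str) -> str:
    # Placeholder body; the real function would filter refs for this collection.
    return f"{bid}/{cid}"


result = cached_lookup("main", "ms-images")
```

The configured size can be checked at runtime with `cached_lookup.cache_info().maxsize`, which also makes it easy to log hit and miss counts when tuning the value.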

Comment thread git-reader/app.py
def filter_refs(
repo: pygit2.Repository,
bid: str,
cid: str,

Why don't we need any cache_bust here? How does the cached list of references catch up with updates to the git repo?
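
To make the concern concrete, here is a sketch of one way a cache could follow repo updates: pass a version token as part of the cache key, so a changed token forces a fresh listing. Everything here is hypothetical (`FakeRepo`, `version`, `sorted_refs`); in particular, a real token would have to change whenever any ref changes, which a HEAD commit id alone may not guarantee.

```python
from functools import lru_cache


class FakeRepo:
    """Hypothetical stand-in for a repository with a changing version token."""

    def __init__(self, refs, version):
        self.refs = list(refs)
        self.version = version  # assumed to change on every repo update
        self.list_calls = 0

    def list_refs(self):
        self.list_calls += 1
        return list(self.refs)


@lru_cache(maxsize=8)
def _sorted_refs(repo, version):
    # `version` is unused in the body; it only takes part in the cache key.
    return tuple(sorted(repo.list_refs()))


def sorted_refs(repo):
    return _sorted_refs(repo, repo.version)


repo = FakeRepo(["refs/tags/b", "refs/tags/a"], version="abc")
before = sorted_refs(repo)
again = sorted_refs(repo)          # cache hit, same version token

repo.refs.append("refs/tags/c")    # simulate a repo update
repo.version = "def"
after = sorted_refs(repo)          # new token, so the refs are listed again
```

With this approach stale entries are not deleted; they simply stop being requested and eventually fall out of the LRU cache.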
