Commit a937b37
revision: quit pruning diff more quickly when possible
When the revision traversal machinery is given a pathspec,
we must compute the parent-diff for each commit to determine
which ones are TREESAME. We set the QUICK diff flag to avoid
looking at more entries than we need; we really just care
whether there are any changes at all.
But there is one case where we want to know a bit more: if
--remove-empty is set, we care about finding cases where the
change consists only of added entries (in which case we may
prune the parent in try_to_simplify_commit()). To cover that
case, our file_add_remove() callback does not quit the diff
upon seeing an added entry; it keeps looking for other types
of entries.
But this means when --remove-empty is not set (and it is not
by default), we compute more of the diff than is necessary.
You can see this in a pathological case where a commit adds
a very large number of entries, and we limit based on a
broad pathspec. E.g.:
perl -e '
chomp(my $blob = `git hash-object -w --stdin </dev/null`);
for my $a (1..1000) {
for my $b (1..1000) {
print "100644 $blob\t$a/$b\n";
}
}
' | git update-index --index-info
git commit -qm add
git rev-list HEAD -- .
This case takes about 100ms now, but after this patch only
needs 6ms. That's not a huge improvement, but it's easy to
get and it protects us against even more pathological cases
(e.g., going from 1 million to 10 million files would take
ten times as long with the current code, but not increase at
all after this patch).
This is reported to minorly speed-up pathspec limiting in
real world repositories (like the 100-million-file Windows
repository), but probably won't make a noticeable difference
outside of pathological setups.
This patch actually covers the case without --remove-empty,
and the case where we see only deletions. See the in-code
comment for details.
Note that we have to add a new member to the diff_options
struct so that our callback can see the value of
revs->remove_empty_trees. This callback parameter could be
passed to the "add_remove" and "change" callbacks, but
there's not much point. They already receive the
diff_options struct, and doing it this way avoids having to
update the function signature of the other callbacks
(arguably the format_callback and output_prefix functions
could benefit from the same simplification).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>1 parent 9752ad0 commit a937b37
2 files changed
Lines changed: 14 additions & 3 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
180 | 180 | | |
181 | 181 | | |
182 | 182 | | |
| 183 | + | |
183 | 184 | | |
184 | 185 | | |
185 | 186 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
394 | 394 | | |
395 | 395 | | |
396 | 396 | | |
397 | | - | |
398 | | - | |
| 397 | + | |
| 398 | + | |
| 399 | + | |
| 400 | + | |
| 401 | + | |
| 402 | + | |
| 403 | + | |
| 404 | + | |
| 405 | + | |
| 406 | + | |
399 | 407 | | |
400 | 408 | | |
401 | 409 | | |
| |||
406 | 414 | | |
407 | 415 | | |
408 | 416 | | |
| 417 | + | |
409 | 418 | | |
410 | 419 | | |
411 | | - | |
| 420 | + | |
412 | 421 | | |
413 | 422 | | |
414 | 423 | | |
| |||
1346 | 1355 | | |
1347 | 1356 | | |
1348 | 1357 | | |
| 1358 | + | |
1349 | 1359 | | |
1350 | 1360 | | |
1351 | 1361 | | |
| |||
0 commit comments