Split cold and warm index writes#3618
Conversation
…locks for evicting series from in-mem segments
| // the active block. We ensure that can't happen by always writing the cold index in-mem data | ||
| // directly. | ||
| warmBatch := index.NewWriteBatch(batch.Options()) | ||
| coldBatch := index.NewWriteBatch(batch.Options()) |
There was a problem hiding this comment.
This is the simplest implementation for splitting. Assuming this approach works I can fix this up to avoid needless allocs (e.g. if no cold writes, just use the original batch instead of newing two more, etc.)
There was a problem hiding this comment.
Since there's only ever a single goroutine at a time ever calling writeBatchForBlockStart(...) can we actually just use a lock (like a new "writeBatchesLock") and then always reuse the same warm batches and cold batches struct and reset at the start of each write batch call? (not sure if it's reuseable but should technically be trivial to make reuseable).
| // NB: users are expected to use `NewEntry` to construct these objects. | ||
| type Entry struct { | ||
| relookupAndIncrementReaderWriterCount func() (index.OnIndexSeries, bool) | ||
| queryableBlockRetriever series.QueryableBlockRetriever |
There was a problem hiding this comment.
Note @robskillington the way I'm exposing the way to get "block states" which we check against for RequiresColdFlushForBlockStart, is by having the Entry have a pointer to the dbShard which implements this series.QueryableBlockRetriever type. Let me know of any thoughts here.
| // Don't pass stats back from insertion into a cold block, | ||
| // we only care about warm mutable segments stats. |
There was a problem hiding this comment.
Hm, can't we make that decision in the caller not to use the stats? Or does the "Combine()" method do it automatically? Could we make it a "CombineWithoutStats()" so that at least this doesn't respond with just an empty datastructure that we don't really know should be empty or not from the viewpoint of the caller?
Codecov Report✅ All modified and coverable lines are covered by tests.
Additional details and impacted files@@ Coverage Diff @@
## master #3618 +/- ##
=========================================
- Coverage 57.3% 46.0% -11.4%
=========================================
Files 551 213 -338
Lines 64021 19078 -44943
=========================================
- Hits 36712 8784 -27928
+ Misses 24115 9608 -14507
+ Partials 3194 686 -2508
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report in Codecov by Harness.
🚀 New features to boost your workflow:
|
| if !ok { | ||
| return false | ||
| } | ||
| coldBlocks := entry.Series.ColdFlushBlockStarts(v) |
There was a problem hiding this comment.
This method unfortunately doesn't quite do what I think you might think it does, it only returns cold blocks that "need to be flushed":
https://github.com/m3db/m3/blob/2e48cd9759e228a92114eaf2b20d28e7f1ee19aa/src/dbnode/storage/series/buffer.go#L436:L445
I would go and implement a new method on src/dbnode/storage/series/series.go which calls src/dbnode/storage/series/buffer.go and returns True if a block start has a cold bucket that's not empty.
i.e. something like:
func (b *dbBuffer) ColdWritesAtBlockStartExist(blockStart xtime.UnixNano) bool {
bucketVersions, ok := b.bucketsMap[blockStart]
if !ok {
return false
}
for _, bucket := range bucketVersions.buckets {
if bucket.writeType == ColdWrite && bucket.streamsLen() > 0 {
return true
}
}
return false
}
No description provided.