public class Collector
- extends Object
Responsible for maintaining a consistent view of committed segments. The
collector maintains a union of segments on the file system that make up the
logical whole segment. At any point in time, a subset of the segments under
the "committed" directory comprise the segment data. We'll call these the
live segments. For any given [entry] ID, at most one live segment
contains the corresponding entry [data]. That is, the live segments do not
contain any overlapping IDs. Here's an illustration:
The horizontal axis represents entry IDs, and each block of X's represents a
segment spanning a range of IDs on the file system. The live segments,
together with the union of all [file-backed] delete-sets, comprise the
complete logical segment. Typically, there are other segments under the
committed directory that don't contribute to the data model. These are
segments that have already been appended to another segment and have
consequently left the model (see below). We call
a segment on the file system that no longer contributes to the model a
dead segment. Segments and delete-set files are maintained in
unit directories. A unit may contain backing files for a segment, a delete-set, or
both. Entry IDs in file-backed delete-sets are periodically applied to
(deleted from) the (live) segments. If all the IDs in a file-backed
delete-set have already been applied to the live segments, then the given
delete-set is considered dead; otherwise, it is considered
live. At any point in time, a unit directory may contain any
combination of live, dead, or no segment, and live, dead, or no delete-set.
That is, a unit can even be empty. The collector periodically purges (deletes)
dead segment and dead delete-set files and, if it is empty, the unit directory itself.
- No two nonempty segments share the same base [entry] ID.
- We only ever append a
segment to its adjacent lower neighbor.
- A segment that is being appended to another is itself never modified.
(Entry IDs in delete-sets are not applied to the
segment while it is being appended to another segment.)
- A segment that covers another segment overrides
that segment. A segment is said to cover another segment if its range of
entry IDs spans that of the other and it has more entries than the other
segment. Covered segments are not considered to be part of the data model:
they are ignored and are subject to collection (deletion).
- As a result of the above rule, after a source segment is
appended to a nonempty destination segment, the source segment is
no longer in the data model.
- The only empty segment in the model is the initial empty segment. In a
transactional setting (
SegmentStore.newTransaction()), this empty
segment is never appended to; it leaves the model as soon as a non-empty
segment with the same base ID enters (is moved to) the committed directory.
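The cover rule above lends itself to a small model. The following is a minimal sketch, not the collector's actual implementation: `CoverRuleSketch`, `Seg`, and `liveNames` are hypothetical names introduced for illustration, and a segment is reduced to a base ID and an entry count. It assumes a segment is live exactly when no other committed segment covers it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the cover rule. All names here are hypothetical. */
final class CoverRuleSketch {

    /** Stand-in for a committed segment holding entry IDs [baseId, baseId + count). */
    static final class Seg {
        final String name;
        final long baseId;
        final long count;

        Seg(String name, long baseId, long count) {
            this.name = name;
            this.baseId = baseId;
            this.count = count;
        }

        long nextId() {
            return baseId + count;
        }

        /** True if this segment's ID range spans the other's and it holds more entries. */
        boolean covers(Seg other) {
            return baseId <= other.baseId
                && nextId() >= other.nextId()
                && count > other.count;
        }
    }

    /** Names of the segments that no other segment covers, in input order. */
    static List<String> liveNames(List<Seg> committed) {
        List<String> live = new ArrayList<>();
        for (Seg candidate : committed) {
            boolean covered = false;
            for (Seg other : committed) {
                if (other != candidate && other.covers(candidate)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                live.add(candidate.name);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<Seg> committed = Arrays.asList(
            new Seg("empty", 0, 0), // initial empty segment
            new Seg("A", 0, 10),    // destination: grew by absorbing B
            new Seg("B", 5, 5),     // source of the append, now covered by A
            new Seg("C", 10, 3));   // most recently committed segment
        System.out.println(liveNames(committed));
    }
}
```

In this example only A and C remain live: A covers both the initial empty segment and B, while C has no lower neighbor that spans it. The pairwise scan is quadratic and is chosen here for readability only.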
The collector maintains the following lifecycles for segments, delete-sets,
and the unit directories they live in. Segments are merged and delete-sets
are processed asynchronously, so it is worthwhile to lay out their lifecycles.
Segment lifecycle
- Added. When a new segment (and its parent
unit directory) is first added, it contains only the
new entries inserted in the committed transaction. Its
base ID equals the
next ID property of the logical
union as it stood immediately before the commit.
The segment is live. (It's in the data model.)
DeleteSets may be applied to it.
- Appended (optional) Once a segment is added, it may also
be appended to. Segments are periodically merged by
appending one (the source) to
another (the destination). This segment state corresponds to that of the
destination segment after it has been appended to. The segment continues
to be live. (It's in the data model.)
DeleteSets may be
applied to it--even while the merge is underway.
- Overridden (a.k.a. covered) This segment state
corresponds to that of the source segment after it has been the argument of
an append. After the merge has completed, the source segment is no longer in
the model. The segment is logically dead, even though its
backing files still exist.
DeleteSets are not applied to source
segments while the merge is underway.
- Zombie (optional) The just overridden segment is
logically dead, but its backing files are still in use by one or more
is greater than zero. Note that once the segment store is
closed, all zombie segments become
dead the next time the store is opened.
- Dead The overridden segment is dead, but its backing
files have yet to be purged.
- Purged The overridden segment is dead, and its backing
files have been purged.
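The segment states above can be read as a small state machine. The sketch below is one interpretation of that description, not code taken from the collector: the `SegmentState` enum and its transition table are assumptions introduced for illustration. In particular, it assumes a destination segment may be appended to repeatedly and that the Zombie state is skipped when nothing still holds the backing files.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical model of the segment lifecycle described above. */
enum SegmentState {
    ADDED, APPENDED, OVERRIDDEN, ZOMBIE, DEAD, PURGED;

    /** Only added and appended-to segments are part of the data model. */
    boolean isLive() {
        return this == ADDED || this == APPENDED;
    }

    /** States reachable in one step, as read from the lifecycle description. */
    Set<SegmentState> next() {
        switch (this) {
            case ADDED:
            case APPENDED:
                return EnumSet.of(APPENDED, OVERRIDDEN);
            case OVERRIDDEN:
                return EnumSet.of(ZOMBIE, DEAD);
            case ZOMBIE:
                return EnumSet.of(DEAD);
            case DEAD:
                return EnumSet.of(PURGED);
            default:
                return EnumSet.noneOf(SegmentState.class);
        }
    }

    boolean canMoveTo(SegmentState target) {
        return next().contains(target);
    }

    public static void main(String[] args) {
        for (SegmentState state : values()) {
            System.out.println(state + " live=" + state.isLive() + " next=" + state.next());
        }
    }
}
```

Keeping the transition table in one place makes it easy to check that a segment never re-enters the model once it has been overridden.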
Delete-set lifecycle
- Added The delete-set file was added as part of a
committed transaction in which pre-existing entries were deleted.
- Dead The entry IDs listed in the delete-set have been
deleted in (applied to) the relevant segments. The delete-set is now
redundant, but its backing file still exists.
- Purged The applied delete-set's backing file has been
purged.
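The live/dead distinction for delete-sets comes down to whether any listed entry ID is still waiting to be applied. A minimal in-memory sketch follows. `DeleteSetSketch` and its methods are hypothetical and do not reflect how the library stores or tracks delete-sets on disk.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory stand-in for a file-backed delete-set. */
final class DeleteSetSketch {

    /** Entry IDs listed in the delete-set that have not yet been applied. */
    private final Set<Long> pending;

    DeleteSetSketch(Collection<Long> ids) {
        this.pending = new HashSet<>(ids);
    }

    /** Records that the given entry ID has been deleted from its live segment. */
    void markApplied(long id) {
        pending.remove(id);
    }

    /** A delete-set is dead once every listed ID has been applied. */
    boolean isDead() {
        return pending.isEmpty();
    }

    public static void main(String[] args) {
        DeleteSetSketch deleteSet = new DeleteSetSketch(Arrays.asList(3L, 7L));
        System.out.println("dead before applying: " + deleteSet.isDead());
        deleteSet.markApplied(3L);
        deleteSet.markApplied(7L);
        System.out.println("dead after applying:  " + deleteSet.isDead());
    }
}
```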
Unit directory lifecycle
- Added Under the collector, a
unit directory begins life by being moved to the
"committed" directory. The directory initially contains only the transaction
data being committed. That is, it will contain either a segment of new
entries, a delete-set of pre-existing entries, or both. Once added, the
segment and/or delete-set in the unit follow the course of their own
lifecycles (as described above), until...
- Purgeable Both the segment and delete-set, if any, have
been purged. I.e. there are no files in the unit directory, and it is
eligible for purging.
- Purged The directory has been removed.
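The Purgeable check for a unit directory amounts to testing whether the directory is empty before removing it. The sketch below shows one way to do that with `java.nio.file`. It is an illustration under assumptions, not the collector's code: `UnitPurgeSketch`, its methods, and the file name `segment.dat` are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical helper illustrating the Purgeable and Purged unit states. */
final class UnitPurgeSketch {

    /** True if the directory holds no files, i.e. the unit is purgeable. */
    static boolean isPurgeable(Path unitDir) {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(unitDir)) {
            return !entries.iterator().hasNext();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Removes the directory if it is empty; returns whether it was removed. */
    static boolean purgeIfEmpty(Path unitDir) {
        if (!isPurgeable(unitDir)) {
            return false;
        }
        try {
            Files.delete(unitDir);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Walks one temporary unit directory from occupied to purged. */
    static boolean[] demo() {
        try {
            Path unit = Files.createTempDirectory("unit");
            // "segment.dat" is a made-up file name standing in for a backing file.
            Path backingFile = Files.createFile(unit.resolve("segment.dat"));
            boolean purgedWhileOccupied = purgeIfEmpty(unit);
            Files.delete(backingFile);
            boolean purgedWhenEmpty = purgeIfEmpty(unit);
            return new boolean[] {purgedWhileOccupied, purgedWhenEmpty, Files.exists(unit)};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

The directory is removed only after its last backing file is gone, which mirrors the ordering in the lifecycle: segment and delete-set files are purged first, then the unit itself.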
- Babak Farhang
- See Also:
An illustration of how the system determines which
segments are live.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
public Collector(File committedDir)
public Collector(File committedDir,
public int nextUnitId()
public File getDirectory()
public Segment getReadOnlyView()
- Returns a read-only view of the committed segments.
public Segment getReadOnlySnapShot()
- Returns a read-only snapshot of the committed segments. Note this
is not a true snapshot; new deletes may occasionally become visible.
What's frozen is the number of entries; new insertions are not
seen by this view.
public void commitUnit(UnitDir unit)
public void close()