Manifest discussions
Day 1
- ‘Schema’ is what is exchanged on the wire protocol. It has not been broken in the past, so old clients can still work with it.
- Tree manifests: It’s possible to calculate both hashes (tree hash and flat manifest hashes).
- Mozilla central needs 25MB of mapping from old to new hash scheme for both changeset and manifest
- Having the mapping might cause people to reimplement git alias (spectral note: I don’t know what this is ;))
- Two switches: new manifest format, and new manifest hash
- Mozilla can use the new format (trees, manifest v2) without a new hash
- While we’re changing the ‘schema’ (hashes), what do we want from changeset v2?
- “extra” key/value pairs on individual lines
- Support \n in filenames
- Rename information?
- add/edit/delete status in changeset (no need to touch manifest)?
- More strict author field validation (require email/rfc format?)
- hg log on directories becomes faster with tree manifests
- rename cache to store that a file has not been renamed, so that a lot of checks become much quicker
- Historical note: the reason for the file list in the changeset is that it’s for push/pull, so deletes didn’t originally show up there, because a delete didn’t change the filelog.
- sid0: do we want to store ‘this got deleted’ information in the filelogs, so that hg log <file> shows that it happened?
- Default is ‘flat manifests’, since the gut feeling is that for ~98% of projects this is the best one for them
- Certain projects want tree manifests on disk (client? or server? or both (separately? concurrently?))
- Could make a read-only copy using old hash to do a more gradual migration to exchanging tree manifests?
Three use cases:
- Mozilla central today: a flag that turns on a new disk format, but with no hash changes, so exchange is unaffected.
- Google soon: start from scratch with new hashes
- Transition from 1->2
Two flags:
- Storing tree manifests locally (old hashing)
- Break the schema
For flag #1 without #2: The manifest revlog (root-level 00manifest.{i,d}) would have the old hash as its nodeid, and it wouldn’t strictly match the contents at that version.
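The point about the nodeid not matching the stored contents can be sketched as follows. This is a minimal illustration, not Mercurial's implementation: it assumes a manifest is a nested dict mapping names to 20-byte file nodes or sub-directories, ignores flags, and hashes with null parents (so it sidesteps the parent-revision difficulty noted later). The helper names `flat_hash` and `tree_hash`, and the `t` marker on directory entries, are assumptions made for the example.

```python
import hashlib

NULLID = b"\0" * 20


def node_hash(text, p1=NULLID, p2=NULLID):
    # Revlog-style node id: SHA-1 over the two sorted parent ids, then the text.
    a, b = sorted((p1, p2))
    return hashlib.sha1(a + b + text).digest()


def flatten(tree, prefix=""):
    # Walk the nested dict and yield (full path, file node) pairs.
    for name, value in tree.items():
        path = prefix + name
        if isinstance(value, dict):
            yield from flatten(value, path + "/")
        else:
            yield path, value


def flat_manifest_text(tree):
    # Flat-style text: one "path NUL hexnode LF" line per file, sorted by path bytes.
    entries = sorted((path.encode(), node) for path, node in flatten(tree))
    return b"".join(p + b"\0" + n.hex().encode() + b"\n" for p, n in entries)


def flat_hash(tree):
    # The "old" hash: what a flat-manifest client would compute.
    return node_hash(flat_manifest_text(tree))


def tree_hash(tree):
    # The "new" hash (assumed scheme): each directory is hashed on its own,
    # and the parent records the sub-directory hash with a 't' marker.
    lines = []
    for name in sorted(tree, key=str.encode):
        value = tree[name]
        if isinstance(value, dict):
            lines.append(name.encode() + b"\0" + tree_hash(value).hex().encode() + b"t\n")
        else:
            lines.append(name.encode() + b"\0" + value.hex().encode() + b"\n")
    return node_hash(b"".join(lines))
```

With flag #1 only, the root manifest revlog would be keyed by something like `flat_hash(tree)` while storing the tree-shaped text, which is why the nodeid would not be the hash of what is actually on disk.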
An extension (client and server side) that can maintain a map for old-hashes in bug trackers?
For getting to Flag #2:
- Default on the server is that it does not accept manifest v2
- no v1 children with v2 parents
- Server then enables v2 pushes to it; the next change with v2 will upgrade all future changes
- Upgrade during exchange v1->v2? Maybe not needed?
- Command to downgrade from v2->v1 if you get ‘infected’ with the virus should be pretty easy.
- flat-hashing a tree manifest would be more difficult than it might seem at first, because parent revisions
- A new challenger appears! (4th use case?)
- Matrix: flat-right-now vs. flat-with-subdir-hashes vs. tree manifests,
manifestv1 vs. manifestv2, hashv1 vs. hashv2
- Are deltas going to be broken in any of these?
- Manifest Feature Matrix
- So we’re thinking implement 6, 8/9, 14 on the way to 17, benchmark them,
see if the benefits make it so that implementing the
conversion-during-exchange makes sense.
- benchmarks need to consider clone time, server cpu usage, on-disk size
- 6=14 and 8=17 if we don’t care about breaking hashes, 8=9 if we don’t care about exchange
- Client version announcement (User-Agent string?)
- As a ‘backport extension’?
- Include hg version, extensions? python version? platform?
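As a rough illustration of the client version announcement idea, the sketch below builds a User-Agent-style string from the pieces listed above. The format, the function name `client_announcement`, and the example version are all hypothetical; the notes do not settle on any of them.

```python
import platform
import sys


def client_announcement(hg_version, extensions=()):
    # Hypothetical format: space-separated "key/value" tokens.
    parts = [
        f"mercurial/{hg_version}",
        f"python/{platform.python_version()}",
        f"platform/{sys.platform}",
    ]
    if extensions:
        # Sort so the same set of extensions always announces identically.
        parts.append("extensions/" + ",".join(sorted(extensions)))
    return " ".join(parts)


if __name__ == "__main__":
    print(client_announcement("3.4", ["rebase", "mq"]))
```

A server could log or parse such a string to learn which clients still need the old format before deciding when to break the schema.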
Day 2
Google wants new tree-structure manifests.
It’d be nice to not break old clients. Can compute old format hash for tree manifest on disk.
Three use cases:
- mozilla-central today
- Google soon
- Never accept v1 manifests, ever.
- Transition from 1 to 2 case
- (~2 years out, needs time for clients to upgrade naturally)
- prevention use case
- Implementation-wise, this really means you don’t set the schema change flag on the server.
- Idea: server could rewrite as v1 when receiving push using v2, tell client (using bundle2)
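The prevention case, together with the Day 1 rules (server rejects v2 by default, no v1 children with v2 parents), can be sketched as a push check. This is an assumed model only: `Incoming`, `check_push`, and the `accept_v2` switch are hypothetical names, and incoming changesets are assumed to arrive parents-first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Incoming:
    node: str
    manifest_version: int  # 1 or 2
    parents: tuple = ()    # node ids of the parents


def check_push(incoming, known_versions, accept_v2=False):
    """Return a list of rejection reasons; an empty list means the push is fine."""
    versions = dict(known_versions)  # node id -> manifest version already on the server
    errors = []
    for cs in incoming:
        if cs.manifest_version == 2 and not accept_v2:
            errors.append(f"{cs.node}: server does not accept manifest v2")
        if cs.manifest_version == 1 and any(versions.get(p) == 2 for p in cs.parents):
            errors.append(f"{cs.node}: v1 child of a v2 parent")
        # Record the version so later changesets in the same push see it.
        versions[cs.node] = cs.manifest_version
    return errors
```

Because a v1 child of a v2 parent is always rejected, the first v2 changeset accepted on a branch forces every descendant to be v2, which matches the "upgrade all future changes" behaviour described on Day 1.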
Two flags:
- Store tree manifests locally but use old hashing
  - Transcode to old manifest format over the wire
  - store old hash in the changelog entry
- Break the schema
  - allow new hashing scheme to be recorded in changelog
  - exchange the new revlogs
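One way to read the two flags is as independent switches, sketched below. The names `ManifestFlags` and `describe` are hypothetical, and the combination "break the schema without tree storage" is not covered in the notes; the sketch simply treats the flags as independent there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestFlags:
    store_trees: bool = False   # flag 1: tree manifests on disk
    break_schema: bool = False  # flag 2: new hashing, new revlogs exchanged


def describe(flags):
    return {
        "on_disk": "tree" if flags.store_trees else "flat",
        "changelog_hash": "new" if flags.break_schema else "old",
        "wire": "new revlogs" if flags.break_schema else "flat v1 manifest",
        # Only flag 1 alone needs conversion between disk and wire formats.
        "transcode_on_exchange": flags.store_trees and not flags.break_schema,
    }
```

Under this reading, mozilla-central today corresponds to `ManifestFlags(store_trees=True)`, and the start-from-scratch case corresponds to both flags set.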
MAY enforce a changeset schema change when we do flag 2? Not sure if it really matters.
Layout v2: orthogonal from all of these concerns?
- Puts file hashes on separate lines for compression benefits