Git

gitformat-commit-graph last updated in 2.47.0

NAME

gitformat-commit-graph - Git commit-graph format

SYNOPSIS

$GIT_DIR/objects/info/commit-graph
$GIT_DIR/objects/info/commit-graphs/*

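DESCRIPTION

The Git commit-graph stores a list of commit OIDs and some associated metadata, including:

The generation number of the commit.
The root tree OID.
The commit date.
The parents of the commit, stored using positional references within the graph file.
The Bloom filter of the commit, carrying the paths that were changed between the commit and its first parent, if requested.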

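These positional references are stored as unsigned 32-bit integers corresponding to the array position within the list of commit OIDs. Because some special constants are used to track parents, a commit-graph can store at most (1 << 30) + (1 << 29) + (1 << 28) - 1 (around 1.8 billion) commits.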
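The bound can be checked with a little arithmetic. The sketch below is only an illustration; the constant names are made up for this example and are not Git identifiers. It shows that the limit falls just below 0x70000000, the value the Commit Data chunk uses to mark a missing parent.

```python
# Illustrative names only (assumptions, not Git identifiers).
MAX_COMMITS = (1 << 30) + (1 << 29) + (1 << 28) - 1
NO_PARENT = 0x70000000  # "no parent" marker used in the Commit Data chunk

# The limit is exactly one less than the "no parent" marker, so a valid
# position can never be mistaken for that marker.
print(MAX_COMMITS, hex(MAX_COMMITS), MAX_COMMITS == NO_PARENT - 1)
```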

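COMMIT-GRAPH FILES HAVE THE FOLLOWING FORMAT

In order to allow extensions that add extra data to the graph, we organize the body into "chunks" and provide a binary lookup table at the beginning of the body. The header includes certain values, such as the number of chunks and the hash type.

All multi-byte numbers are in network byte order.

HEADER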

4-byte signature:
    The signature is: {'C', 'G', 'P', 'H'}

1-byte version number:
    Currently, the only valid version is 1.

1-byte Hash Version
    We infer the hash length (H) from this value:
      1 => SHA-1
      2 => SHA-256
    If the hash type does not match the repository's hash algorithm, the
    commit-graph file should be ignored with a warning presented to the
    user.

1-byte number (C) of "chunks"

1-byte number (B) of base commit-graphs
    We infer the length (H*B) of the Base Graphs chunk
    from this value.
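
The header fields above add up to 8 bytes. As a rough sketch (the function name `parse_header` and the returned dictionary keys are hypothetical, chosen for this example), they could be decoded like this, using the fact that SHA-1 digests are 20 bytes and SHA-256 digests are 32 bytes:

```python
import struct

def parse_header(data):
    """Decode the 8-byte commit-graph header (hypothetical helper)."""
    # ">" selects network byte order; every field here is a single byte
    # except the 4-byte signature.
    signature, version, hash_version, num_chunks, num_base = struct.unpack(
        ">4sBBBB", data[:8]
    )
    if signature != b"CGPH":
        raise ValueError("not a commit-graph file")
    if version != 1:
        raise ValueError("unsupported version %d" % version)
    hash_len = {1: 20, 2: 32}.get(hash_version)  # SHA-1, SHA-256
    if hash_len is None:
        raise ValueError("unknown hash version %d" % hash_version)
    return {
        "hash_len": hash_len,          # H
        "num_chunks": num_chunks,      # C
        "num_base_graphs": num_base,   # B
    }
```

A caller would still have to compare the hash type against the repository's hash algorithm, which this sketch does not attempt.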

CHUNK LOOKUP

(C + 1) * 12 bytes listing the table of contents for the chunks:
    First 4 bytes describe the chunk id. Value 0 is a terminating label.
    Other 8 bytes provide the byte-offset in current file for chunk to
    start. (Chunks are ordered contiguously in the file, so you can infer
    the length using the next chunk position if necessary.) Each chunk
    ID appears at most once.

The CHUNK LOOKUP matches the table of contents from
the chunk-based file format, see gitformat-chunk[5]
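
A minimal sketch of reading this table of contents follows. It assumes the lookup starts right after the 8-byte header and that the terminating entry's offset marks the end of the last chunk; `parse_chunk_lookup` and `HEADER_LEN` are names invented for this example.

```python
import struct

HEADER_LEN = 8  # signature, version, hash version, C, B

def parse_chunk_lookup(data, num_chunks):
    """Map each chunk id to (offset, length); hypothetical helper."""
    # Each entry is a 4-byte id followed by an 8-byte offset (12 bytes).
    entries = [
        struct.unpack_from(">4sQ", data, HEADER_LEN + 12 * k)
        for k in range(num_chunks + 1)
    ]
    if entries[-1][0] != b"\x00\x00\x00\x00":
        raise ValueError("missing terminating label")
    chunks = {}
    # Chunks are contiguous, so a chunk's length is the distance to the
    # next entry's offset.
    for (chunk_id, offset), (_, next_offset) in zip(entries, entries[1:]):
        if chunk_id in chunks:
            raise ValueError("duplicate chunk id")
        chunks[chunk_id] = (offset, next_offset - offset)
    return chunks
```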

The remaining data in the body is described one chunk at a time, and
these chunks may be given in any order. Chunks are required unless
otherwise specified.

CHUNK DATA

OID Fanout (ID: {O, I, D, F}) (256 * 4 bytes)

The ith entry, F[i], stores the number of OIDs with first
byte at most i. Thus F[255] stores the total
number of commits (N).
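
Since F[i] is a running count, the OIDs whose first byte equals i occupy positions F[i-1] up to (but not including) F[i], with F[-1] taken as 0. A small sketch of that lookup, using a hypothetical helper name:

```python
import struct

def fanout_range(fanout_chunk, first_byte):
    """Return (lo, hi): positions lo..hi-1 hold OIDs starting with first_byte."""
    hi = struct.unpack_from(">I", fanout_chunk, 4 * first_byte)[0]
    if first_byte == 0:
        lo = 0
    else:
        lo = struct.unpack_from(">I", fanout_chunk, 4 * (first_byte - 1))[0]
    return lo, hi
```

Narrowing the search to this range before looking in the OID Lookup chunk avoids scanning the whole list.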

OID Lookup (ID: {O, I, D, L}) (N * H bytes)

The OIDs for all commits in the graph, sorted in ascending order.
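
Because the OIDs are sorted, a commit's position can be found by binary search. The sketch below is one way to do that, not a description of Git's own code; `find_oid` is a hypothetical name, and the optional `lo`/`hi` bounds are meant to be supplied from the fanout table when available.

```python
def find_oid(oid_lookup, hash_len, oid, lo=0, hi=None):
    """Binary-search the OID Lookup chunk; return the position or None."""
    if hi is None:
        hi = len(oid_lookup) // hash_len
    while lo < hi:
        mid = (lo + hi) // 2
        candidate = oid_lookup[mid * hash_len:(mid + 1) * hash_len]
        if candidate == oid:
            return mid
        # Python compares bytes lexicographically by unsigned byte value,
        # which matches the ascending order of the chunk.
        if candidate < oid:
            lo = mid + 1
        else:
            hi = mid
    return None
```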

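Commit Data (ID: {C, D, A, T }) (N * (H + 16) bytes)

The first H bytes are for the OID of the root tree.
The next 8 bytes are for the positions of the first two parents of the ith commit. The value 0x70000000 is stored if there is no parent in that position. If there are more than two parents, the second value has its most-significant bit on and the other bits store an array position into the Extra Edge List chunk.
The next 8 bytes store the topological level (generation number v1) of the commit and the commit time in seconds since EPOCH. The generation number uses the higher 30 bits of the first 4 bytes, while the commit time uses the 32 bits of the second 4 bytes, along with the lowest 2 bits of the lowest byte of the first 4 bytes, which store the 33rd and 34th bits of the commit time.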
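
The bit layout of one Commit Data record can be sketched as follows. The function and constant names are hypothetical; the masks and shifts follow the description above.

```python
import struct

PARENT_NONE = 0x70000000       # no parent in this slot
EXTRA_EDGES_FLAG = 0x80000000  # second slot points into the Extra Edge List

def parse_commit_data(record, hash_len):
    """Decode one (H + 16)-byte Commit Data record (hypothetical helper)."""
    tree_oid = record[:hash_len]
    parent1, parent2, word1, word2 = struct.unpack_from(">IIII", record, hash_len)
    first_parent = None if parent1 == PARENT_NONE else parent1
    second_parent = None
    extra_edges_start = None
    if parent2 & EXTRA_EDGES_FLAG:
        # More than two parents: the low 31 bits index the Extra Edge List.
        extra_edges_start = parent2 & 0x7FFFFFFF
    elif parent2 != PARENT_NONE:
        second_parent = parent2
    return {
        "tree_oid": tree_oid,
        "first_parent": first_parent,
        "second_parent": second_parent,
        "extra_edges_start": extra_edges_start,
        "generation": word1 >> 2,                      # upper 30 bits
        "commit_time": ((word1 & 0x3) << 32) | word2,  # 34-bit timestamp
    }
```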

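Generation Data (ID: {G, D, A, 2 }) (N * 4 bytes) [Optional]

This list of 4-byte values stores the corrected commit date offsets for the commits, arranged in the same order as the Commit Data chunk.
If the corrected commit date offset cannot be stored within 31 bits, the value has its most-significant bit on and the other bits store the position of the corrected commit date in the Generation Data Overflow chunk.
The Generation Data chunk is present only when the commit-graph file is written by compatible versions of Git and, in the case of split commit-graph chains, the topmost layer also has a Generation Data chunk.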

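Generation Data Overflow (ID: {G, D, O, 2 }) [Optional]

This list of 8-byte values stores the corrected commit date offsets for commits whose offsets cannot be stored within 31 bits.
The Generation Data Overflow chunk is present only when the Generation Data chunk is present and at least one corrected commit date offset cannot be stored within 31 bits.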
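
The interplay between the two generation chunks can be sketched as below. This is an illustration under the layout described above; `corrected_date_offset` and `OVERFLOW_FLAG` are names invented for the example.

```python
import struct

OVERFLOW_FLAG = 0x80000000  # most-significant bit of a 4-byte value

def corrected_date_offset(gda2, gdo2, pos):
    """Return the corrected commit date offset for the commit at pos."""
    value = struct.unpack_from(">I", gda2, 4 * pos)[0]
    if value & OVERFLOW_FLAG:
        # The low 31 bits are a position in the 8-byte overflow list.
        index = value & 0x7FFFFFFF
        return struct.unpack_from(">Q", gdo2, 8 * index)[0]
    return value
```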

Extra Edge List (ID: {E, D, G, E}) [Optional]

This list of 4-byte values stores the second through nth parents for
all octopus merges. The second parent value in the commit data stores
an array position within this list along with the most-significant bit
on. Starting at that array position, iterate through this list of commit
positions for the parents until reaching a value with the most-significant
bit on. The other bits correspond to the position of the last parent.
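
The iteration described above can be sketched like this. `extra_parents` is a hypothetical helper; `start` is the array position taken from the second parent value in the commit data, with its most-significant bit already cleared.

```python
import struct

LAST_EDGE_FLAG = 0x80000000  # marks the final parent of an octopus merge

def extra_parents(edge_chunk, start):
    """Collect parent positions from start until the flagged last entry."""
    parents = []
    pos = start
    while True:
        # struct.error is raised if the list ends without a flagged entry.
        value = struct.unpack_from(">I", edge_chunk, 4 * pos)[0]
        parents.append(value & 0x7FFFFFFF)
        if value & LAST_EDGE_FLAG:
            return parents
        pos += 1
```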

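Bloom Filter Index (ID: {B, I, D, X}) (N * 4 bytes) [Optional]

The ith entry, BIDX[i], stores the number of bytes in all Bloom filters from commit 0 to commit i (inclusive) in lexicographic order. The Bloom filter for the ith commit spans from BIDX[i-1] to BIDX[i] (plus header length), where BIDX[-1] is 0.
The BIDX chunk is ignored if the BDAT chunk is not present.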
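
As a rough sketch, the byte span of one filter inside the BDAT chunk can be derived from the index as follows. The helper name is hypothetical, and the 12-byte header length assumes the BDAT header of three unsigned 32-bit integers.

```python
import struct

BDAT_HEADER_LEN = 12  # three unsigned 32-bit integers

def bloom_filter_span(bidx, pos):
    """Return (start, end) byte offsets, relative to the BDAT chunk start."""
    end = struct.unpack_from(">I", bidx, 4 * pos)[0]
    if pos == 0:
        start = 0  # BIDX[-1] is defined as 0
    else:
        start = struct.unpack_from(">I", bidx, 4 * (pos - 1))[0]
    return BDAT_HEADER_LEN + start, BDAT_HEADER_LEN + end
```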

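Bloom Filter Data (ID: {B, D, A, T}) [Optional]

It starts with a header consisting of three unsigned 32-bit integers:
- The version of the hash algorithm being used. We currently support value 2, which corresponds to the 32-bit version of the murmur3 hash implemented exactly as described in https://en.wikipedia.org/wiki/MurmurHash#Algorithm and the double hashing technique using seed values 0x293ae76f and 0x7e646e2 as described in https://doi.org/10.1007/978-3-540-30494-4_26 "Bloom Filters in Probabilistic Verification". Version 1 Bloom filters have a bug that appears when char is signed and the repository has path names that contain characters >= 0x80; Git supports reading and writing them, but this ability will be removed in a future version of Git.
- The number of times a path is hashed, and hence the number of bit positions that cumulatively determine whether a file is present in the commit.
- The minimum number of bits b per entry in the Bloom filter. If the filter contains n entries, then the filter size is the minimum number of 64-bit words that contain n*b bits.
The rest of the chunk is the concatenation of all the computed Bloom filters for the commits in lexicographic order.
Note: Commits with no changes or with more than 512 changes have Bloom filters of length one, with all bits set to zero or to one, respectively.
The BDAT chunk is present if and only if BIDX is present.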
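
A small sketch of the BDAT header and the filter-size rule follows. Both function names are hypothetical, and the hashing itself (murmur3 with double hashing) is deliberately left out.

```python
import struct

def parse_bdat_header(bdat):
    """Decode the three unsigned 32-bit header fields of the BDAT chunk."""
    hash_version, num_hashes, bits_per_entry = struct.unpack_from(">III", bdat, 0)
    return {
        "hash_version": hash_version,
        "num_hashes": num_hashes,
        "bits_per_entry": bits_per_entry,  # b
    }

def filter_size_in_words(num_entries, bits_per_entry):
    """Smallest number of 64-bit words that contain n*b bits."""
    return (num_entries * bits_per_entry + 63) // 64
```

For example, a filter with 10 entries at 10 bits per entry needs 100 bits, which rounds up to two 64-bit words.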

Base Graphs List (ID: {B, A, S, E}) [Optional]

This list of H-byte hashes describes a set of B commit-graph files that
form a commit-graph chain. The graph position for the ith commit in this
file's OID Lookup chunk is equal to i plus the number of commits in all
base graphs.  If B is non-zero, this chunk must exist.
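
The position rule for chained files can be sketched as follows. Both helpers are hypothetical; `base_graph_sizes` stands for the commit counts of the base graphs, which a reader would have to obtain by opening those files.

```python
def split_base_hashes(base_chunk, hash_len):
    """Split the Base Graphs List chunk into its H-byte hashes."""
    return [
        base_chunk[k:k + hash_len]
        for k in range(0, len(base_chunk), hash_len)
    ]

def graph_position(local_index, base_graph_sizes):
    """Position in the whole chain of the commit at local_index in this file."""
    return local_index + sum(base_graph_sizes)
```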

TRAILER

H-byte HASH-checksum of all of the above.
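
A minimal sketch of checking the trailer, assuming the checksum uses the hash algorithm selected by the header's Hash Version and covers every byte before the trailer. `verify_trailer` is a hypothetical name.

```python
import hashlib

def verify_trailer(data, hash_version):
    """Compare the trailing checksum with a hash of everything before it."""
    algo = {1: hashlib.sha1, 2: hashlib.sha256}[hash_version]
    digest_len = algo().digest_size  # H: 20 for SHA-1, 32 for SHA-256
    body, trailer = data[:-digest_len], data[-digest_len:]
    return algo(body).digest() == trailer
```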

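HISTORICAL NOTES

The Generation Data (GDA2) and Generation Data Overflow (GDO2) chunks have the number 2 in their chunk IDs because a previous version of Git wrote possibly erroneous data in these chunks with the IDs "GDAT" and "GDOV". By changing the IDs, newer versions of Git will silently ignore those older chunks and write the new information without trusting the incorrect data.

Part of the git[1] suite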
