The database consists of two primary tables, “fods” (fixed-output derivations) and “commits”. They have a many-to-many relationship managed by the table “fod_commit_links”. The table “fod_references” contains some more information about each fixed-output derivation (described below). Finally, the table “failed_fetches” is a log for when the “identify” procedure cannot fetch the source of a fixed-output derivation.
The rows of this table represent fixed-output derivations from the Guix package graph.
INTEGER PRIMARY KEY
A unique integer identifying the fixed-output derivation.
TEXT NOT NULL
The algorithm used to compute the hash (e.g., “sha256”).
TEXT NOT NULL
The hash of the source.
TEXT
The Software Heritage ID (SWHID) of the source.
INTEGER
(Boolean)
Whether or not the source is in the Software Heritage
archive. If this is FALSE
, then the source
could not be found in the archive. If this
is NULL
, then it has not been checked.
INTEGER
(Boolean)
Reserved for future use. 🙂
TEXT NOT NULL
The serialized derivation à la write-derivation
.
The rows of this table represent commits in the Guix Git repository.
INTEGER PRIMARY KEY
A unique integer identifying the commit.
TEXT NOT NULL UNIQUE
The commit ID (hash) from Git.
INTEGER
(Unix timestamp)
The time the commit was pushed. This comes from the Guix Data Service, which in turn gets it from the guix-commits mailing list.
A simple table linking fixed-output derivations to commits. Every record is unique (the primary key is both columns).
INTEGER NOT NULL REFERENCES fods (fod_id)
A pointer to a fixed-output derivation.
INTEGER NOT NULL REFERENCES commits (commit_id)
A pointer to a commit.
A simple table attaching “references” to fixed-output derivations. Every record is unique (the primary key is both columns).
INTEGER NOT NULL REFERENCES fods (fod_id)
A pointer to a fixed-output derivation.
TEXT NOT NULL
A serialized form of the “uri” field of the origin
record that corresponds to the fixed-output derivation.
For a plain URL, this will be a serialized
string: "https://..."
. For a Git reference,
it will look like:
(git-reference (url "...") (commit "...") (recursive? #f))
It will be similar for other kinds of references (Subversion, CVS, etc.). Oftentimes the reference will have more information and be easier to use than the serialized derivation, which makes it extremely useful.
This table records failed attempts at identifying (finding the SWHID) of fixed-output derivations. It’s main purpose is to prevent trying the same fixed-output derivation over and over. Every record is unique (the primary key is both columns).
INTEGER NOT NULL REFERENCES fods (fod_id)
A pointer to a fixed-output derivation.
INTEGER
(Unix timestamp)
The time that the identification attempt failed.
CREATE TABLE commits (commit_id INTEGER PRIMARY KEY, hash TEXT NOT NULL UNIQUE, push_time INTEGER); CREATE TABLE fods (fod_id INTEGER PRIMARY KEY, algorithm TEXT NOT NULL, hash TEXT NOT NULL, swhid TEXT, is_in_swh INTEGER, is_in_disarchive INTEGER, derivation TEXT NOT NULL, UNIQUE (hash, algorithm)); CREATE TABLE fod_commit_links (fod_id INTEGER NOT NULL REFERENCES fods (fod_id), commit_id INTEGER NOT NULL REFERENCES commits (commit_id), PRIMARY KEY (fod_id, commit_id)); CREATE INDEX fod_commit_links_reverse_index ON fod_commit_links (commit_id); CREATE TABLE fod_references (fod_id INTEGER NOT NULL REFERENCES fods (fod_id), reference TEXT NOT NULL, PRIMARY KEY (fod_id, reference)); CREATE TABLE failed_fetches (fod_id INTEGER NOT NULL REFERENCES fods (fod_id), fetch_time INTEGER NOT NULL, PRIMARY KEY (fod_id, fetch_time));