xkcd.WTF!?


Lungfish

I know having so many base pairs makes rebasing complicated, but you're in Bilateria, so shouldn't you at LEAST be better at using git head?

Explanation

Lungfish have the largest known genome among the vertebrates (133 billion base pairs), and the third-largest known genome of all species. The comic relates this to a common problem in document editing and coding: an author accidentally edits two separately created copies of a document when they meant to edit only one. Both copies can then end up carrying changes that the finished project depends on, or that at least linger as development artifacts in the 'final' product. This may happen if documents are sent for review (or updating) to different editors, or at different times, and the changes from the earlier ones aren't properly integrated with the later ones. The comic posits that the lungfish habitually does this with its own genome, making both copies of a gene essential and increasing the number of base pairs.

When Cueball confronts Lungfish about this bad habit of mismanaging its genome, Lungfish dismisses him by saying it will just "buy more storage". This likely alludes to how people faced with an ever-growing number of files simply buy more storage, either by adding another drive or by paying additional monthly fees for online storage (e.g. iCloud or Google Drive). Because storage is relatively cheap, this often seems like an easy 'solution', but it doesn't address the underlying problems of information fragmentation and management. As well as being issues in their own right, these problems, left unaddressed, can create a repeating pattern that ratchets up storage costs over time. If part of the process is to buy 'fresh' storage space, perhaps in an attempt to rationalise the new and historic files, even more 'temporary' copies (or near-copies) of old files may end up scattered across various layers of storage, confusing matters further later on.

The lungfish's file names, Copy of Copy of Gene v3 (Newest) (2) and Copy of Copy of Gene v3 (Final) (2), strongly hint at a very poorly organized method of version tracking. Both files exist side by side as 'current' versions, and the names show a habit of inconsistent copy-pasting, which would explain why the lungfish keeps editing and maintaining multiple genes instead of a single one. The parts of such names usually arise as follows:

"Copy of"
Older versions of Windows, when copy-and-pasting a file within the same folder, would automatically add "Copy of" to the beginning of the filename, resulting in a file named "Copy of x". If the resulting file were then copied, it would get the prefix again, producing "Copy of Copy of x". Newer versions of Windows instead add "- Copy" to the end of the filename, which has the same effect but keeps the copies next to the original when sorted by name. Other systems apply their own conventions, which can interact badly (or not at all) when duplicated and manually renamed files are moved between them. This was previously referenced in the title text of 1459: Documents.
"(2)"
Numbered labels in brackets can be produced by a few different actions:
  • If a file is downloaded from a webpage (such as webmail, or an intranet/internet repository) and a file with the same name already exists in the download folder, most modern browsers append "(1)", or the next free number, to the new file's name.
  • If a copied file is pasted multiple times into the same folder, the extra copies also receive number labels in the same format. This applies to files that already carry a copy label, so a newly pasted file might end up with a "Copy of" prefix AND a "(2)" suffix, or whichever number does not clash with files already in the same location.
  • If multiple files are renamed at once, all of the files will be given the same name with a number label.
"v3"
Some users keep older and newer drafts of a file in case they need to revert to an earlier version. Drafts can be distinguished with a number label (e.g. "v1", "v2") or a descriptive word (e.g. "draft", "edited"), entirely at the user's discretion. If the scheme is well-defined, this can be useful: when an edit breaks something important, or a file is saved over by mistake, an older draft can recover the lost data. But it can also lead to file hoarding and to accidentally working on an older document. Such a manual method of keeping sequential versions in order is easily misused and leaves the status of each file ambiguous.
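
As a rough illustration of how these conventions stack up, the following Python sketch models the "Copy of" prefix and the "(2)" suffix described above. It is a deliberately simplified model, not the exact behaviour of any version of Windows or of any browser; the function names copy_name_legacy and dedupe_name are hypothetical, and file extensions are ignored because the comic's gene names have none.

```python
def copy_name_legacy(name: str) -> str:
    """Simplified 'older Windows' rule: prefix the name with 'Copy of '."""
    return f"Copy of {name}"


def dedupe_name(name: str, existing: set[str]) -> str:
    """Return `name` unchanged if it is free; otherwise append ' (n)',
    using the lowest n >= 2 that does not clash with `existing`."""
    if name not in existing:
        return name
    n = 2
    while f"{name} ({n})" in existing:
        n += 1
    return f"{name} ({n})"


# Hypothetical folder holding the lungfish's original gene file.
folder = {"Gene v3 (Newest)"}

first = copy_name_legacy("Gene v3 (Newest)")   # copy the original
folder.add(first)
second = copy_name_legacy(first)               # copy the copy
folder.add(second)
pasted_again = dedupe_name(second, folder)     # same name pasted once more
folder.add(pasted_again)

print(pasted_again)  # Copy of Copy of Gene v3 (Newest) (2)
```

Three careless copy operations are enough to reproduce one of the names shown in the comic, and nothing in the resulting name says which file is actually current.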

However, this is not why lungfish have a large genome. While some organisms do contain many copies of genes as a diversification strategy, this mostly occurs only in some plants and single-celled eukaryotes. Lungfish have roughly the same number of genes as a human (and likely slightly fewer), and the large size of the lungfish genome is likely due to poor transposon control causing their chromosomes to fill up with junk.

The title text further compares the biology of lungfish to managing versions of files in a popular version control system called Git. Git includes a symbolic reference called "HEAD" that keeps track of which version of a project is currently being worked on in a particular repository; HEAD normally points to the branch on which new changes will be recorded. Rebasing, in Git, rewrites the lineage of changes by replaying commits onto a different base commit, typically to present a simplified history instead of the changes as they actually accumulated, which may be messy. Cueball says this process is complicated by the large number of 'base pairs'. This is a pun: base pairs are the building blocks of genes, while the 'base' in a change history is the ancestral version to which changes have been applied. Bilateria is a clade of animals characterized by embryonic bilateral symmetry, giving their bodies distinguishable "head" and "tail" ends. Since this applies to lungfish, Cueball says, in another pun, that the lungfish should at least know how to use the "head" reference in Git.
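
The ideas of HEAD and rebasing can be sketched with a toy in-memory model in Python. This is only an illustration under loud assumptions: the Commit class and the history and rebase functions are hypothetical, commits here carry only a message, and real Git replays actual file changes (which may conflict) rather than just copying messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """Toy commit: a message plus a link to its parent (None for the root)."""
    message: str
    parent: "Commit | None" = None


def history(tip):
    """List commit messages from the root up to `tip`."""
    messages = []
    while tip is not None:
        messages.append(tip.message)
        tip = tip.parent
    return messages[::-1]


def rebase(tip, old_base, new_base):
    """Replay the commits after `old_base` (up to `tip`) on top of `new_base`.

    Returns the new tip; the original commits are left untouched.
    """
    to_replay = []
    node = tip
    while node is not None and node is not old_base:
        to_replay.append(node)
        node = node.parent
    new_tip = new_base
    for commit in reversed(to_replay):  # oldest first
        new_tip = Commit(commit.message, new_tip)
    return new_tip


# Two branches that forked from the same root commit.
root = Commit("root")
branches = {
    "main": Commit("B", Commit("A", root)),
    "feature": Commit("Y", Commit("X", root)),
}
HEAD = "feature"  # symbolic reference: names the branch being worked on

# Rebase the current branch onto main, changing its base from root to B.
branches[HEAD] = rebase(branches[HEAD], root, branches["main"])
print(history(branches[HEAD]))  # ['root', 'A', 'B', 'X', 'Y']
```

In this model, HEAD stores a branch name rather than a commit, which is why moving the branch also moves what HEAD resolves to. The more commits there are to replay, the more work a rebase involves, which is the joke behind "so many base pairs".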