What does "blob" in GitHub correspond to?
A

3

9

The word following "blob" in the URL below points to the "master" branch of the given repository:

https://github.com/celery/celery/blob/master/docs/django/first-steps-with-django.rst

As per the above convention, what does the following URL point to?

https://github.com/celery/celery/blob/241d2e8ca85a87a2a6d01380d56eb230310868e3/docs/django/first-steps-with-django.rst

I was reading the latest Celery documentation and wanted to see its source on GitHub, hence the question. Note that I can already view the documentation source for master by going to the "master" branch.

Amorita answered 26/11, 2019 at 9:15 Comment(7)
This should help: git-scm.com/book/en/v2/Git-Internals-Git-Objects - Duckboard
I.e., it points to a specific commit. - Duckboard
Why would someone want to point their documentation to a commit instead of a branch or tag? - Amorita
To make sure the content (wording) and formatting (e.g. line numbering) will never change, so any comment accompanying the link (for example, "See the code at line 5") will always be correct. - Anglesite
What's the standard for this? For example, see the documentation of elasticsearch-py: elasticsearch-py.readthedocs.io/en/latest . Their documentation points to the master branch. - Amorita
@FelixKling, I can't find the above commit in the celery commit log: github.com/celery/celery/… - Amorita
Pull requests are not the same as commits. Here it is: github.com/celery/celery/commit/… . "What's the standard for this?" There is no standard. Everyone can do as they want. - Duckboard
S
11

This is really more a question about GitHub than it is about Git.

Remember, Git itself is all about commits. Each commit stores some data (a snapshot of a set of files) and some metadata, including who made the commit, when, and why. Each commit is uniquely identified by its hash ID. Branch and tag names, if any such names exist, merely serve to find one particular hash ID to get you, or Git, started: one of the metadata items in any commit is a list of parent hash IDs, so Git can start at the last commit and work backwards.
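As a rough illustration of that "start at the last commit and work backwards" idea, here is a toy model in Python. It is only a sketch: the hash IDs (c1, c2, c3), the messages, and the history function are made up, and real Git stores commits as compressed objects rather than dictionaries.

```python
# Toy model only: real Git stores commits as compressed objects, not dicts.
# All hash IDs, messages and names below are made up for illustration.
commits = {
    "c3": {"parents": ["c2"], "message": "update docs"},
    "c2": {"parents": ["c1"], "message": "add docs"},
    "c1": {"parents": [], "message": "initial commit"},
}

# A branch name is just a pointer to one commit's hash ID.
branches = {"master": "c3"}


def history(name):
    """Resolve a branch name, then follow first-parent links backwards."""
    hash_id = branches[name]
    seen = []
    while hash_id is not None:
        seen.append(hash_id)
        parents = commits[hash_id]["parents"]
        hash_id = parents[0] if parents else None
    return seen


print(history("master"))  # ['c3', 'c2', 'c1']
```

The name master is only needed to find c3; everything older is reached through the parent links stored in the commits themselves.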

Commits, with their stored data and metadata, are the reason Git exists. Each Git repository is a collection of commits, plus some ancillary data to help find commits. (A non-bare repository on your computer also provides you with a work-area in which you can do new work, but the commits and ancillary data, which don't let you do new work here, are the bare minimum.)

GitHub, on the other hand, is not about commits. GitHub is about sharing.[1] This sharing uses (bare) Git repositories, but adds a ton more stuff on top of that. The Git repositories (or some kind of repository, anyway[2]) are necessary to this, but are not the added-value part.

As GitHub try to increase their added value, they start adding things like: Here's a convenient way to access one particular file within one particular commit. Your interface to GitHub is an API, and that API is encoded via HTTP/HTTPS. That means URLs and JSON and so on.

In this case, GitHub have invented some particular URL paths (see the anatomy of a URL) that can refer to a file within a commit. They have provided one way to use a commit hash ID plus a file-path-within-commit to access that file in that specific commit, and another way to use a branch name (such as master) plus a file-path-within-commit to access that file in the commit identified by that branch name.
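To make the two URL forms concrete, here is a minimal Python sketch that splits a blob URL into its pieces. The function name parse_blob_url and the returned keys are hypothetical, and it assumes the part after "blob" is a single path segment: a branch name containing "/" cannot be told apart from a directory by looking at the URL alone. It also only checks whether that part looks like a full 40-character hash ID, which is a guess, not proof.

```python
import re
from urllib.parse import urlparse

# Assumption: a full commit hash ID is 40 lowercase hexadecimal characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def parse_blob_url(url):
    """Split a github.com blob URL into owner, repo, ref and file path.

    Assumes the ref is a single path segment; branch names containing
    "/" cannot be distinguished from directories by the URL alone.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) < 5 or parts[2] != "blob":
        raise ValueError("not a blob URL: " + url)
    owner, repo, _, ref = parts[:4]
    return {
        "owner": owner,
        "repo": repo,
        "ref": ref,
        "path": "/".join(parts[4:]),
        "ref_looks_like_full_sha": FULL_SHA.fullmatch(ref) is not None,
    }


info = parse_blob_url(
    "https://github.com/celery/celery/blob/master/"
    "docs/django/first-steps-with-django.rst"
)
print(info["ref"], info["path"])
```

For the second URL in the question, the same function reports the long hexadecimal string as the ref, with the same file path after it.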

To do this in Git, you'd normally just git checkout the branch name, which puts the entire commit into your work-tree, and then look at the file by its OS-level path, which is derived from its in-Git-commit path.[3] But perhaps your question is: How can I view one file from one commit identified by branch name? In which case, try git show:

git show master:path/to/File.ext

will let you view the file stored under that name (path/to/File.ext) from that commit (whatever hash ID the name master resolves to).


[1] Sharing and archival (off-site storage). Two! Our two principal weapons are...

[2] Remember that Bitbucket was once a Mercurial repository sharing site. It held Hg repos, not Git repos. Perhaps someday GitHub will hold some other kind of repository.

[3] The OS-level path might differ from the in-Git path in several ways. For instance, on a typical Windows system, filename case (upper or lower case) is only half-respected, so a Git file named path/to/File.ext might reside in a Windows OS file system under path/TO/file.EXT. A typical macOS file system enforces certain decomposition rules on UTF-8 strings, so macOS might change a Git file's path as well. Linux tends not to interpret UTF-8, so if Git uses an invalid UTF-8 byte-sequence as a file path name, Linux has no issues at all here.

Sukkah answered 26/11, 2019 at 17:0 Comment(0)
R
2

remembering the SHA-1 key for each version of your file isn’t practical; plus, you aren’t storing the filename in your system — just the content. This object type is called a blob.

You can have Git tell you the object type of any object in Git, given its SHA-1 key, with git cat-file -t:

$ git cat-file -t 1f7a7a472abf3dd9643fd615f6da379c4acb3e3a
blob
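To show where such a key comes from, the sketch below recomputes a blob's ID in Python. It assumes a repository using SHA-1 object names, where the ID is the hash of the header "blob", a space, the content length, a NUL byte, and then the content. The function name git_blob_id is hypothetical. Note that the filename is not an input, which matches the point above that only the content is stored.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Hash content the way Git hashes a blob: header, NUL byte, content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known ID in SHA-1 repositories.
print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Two files with identical content therefore share one blob, whatever they are called.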

https://git-scm.com/book/en/v2/Git-Internals-Git-Objects

https://git-scm.com/book/en/v2/images/data-model-1.png

Rochelle answered 6/10, 2021 at 11:27 Comment(0)
M
0

The code after "blob" is the SHA of a specific commit; it can be found in the commit history of the file.

The file in the master branch will be updated from time to time. With this unique code, we can make sure we always view the same version of the file.
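As a small sketch of turning a branch link into such a fixed link, the hypothetical helper below swaps the branch name after "blob" for a commit SHA. It does not look anything up: the caller has to supply the SHA, and the example simply reuses the one from the question, assuming the file existed at that path in that commit.

```python
def to_permalink(blob_url, branch, commit_sha):
    """Replace the branch segment after /blob/ with a commit hash ID."""
    marker = "/blob/" + branch + "/"
    if marker not in blob_url:
        raise ValueError("URL does not point at branch " + branch)
    # Replace only the first occurrence so the file path is left untouched.
    return blob_url.replace(marker, "/blob/" + commit_sha + "/", 1)


print(
    to_permalink(
        "https://github.com/celery/celery/blob/master/"
        "docs/django/first-steps-with-django.rst",
        "master",
        "241d2e8ca85a87a2a6d01380d56eb230310868e3",
    )
)
```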

Medeah answered 17/10, 2022 at 7:59 Comment(0)
