Workflow for static website with large binary assets

I'm maintaining a semi-large web site for my company (a couple hundred pages). It is a static site, with tons of HTML written by hand (i.e., copied and pasted) and binary assets scattered all over the place. These assets include product images, simulation videos, tutorial videos, firmware files, manuals, etc., which change only rarely. Ideally, they would all be stored in one or a few systems where they could be systematically searched and retrieved. Alas, our world isn't ideal, and this is not the case. That's why the previous developer put copies of all these files in the site's file structure along with the code. His workflow was to keep a copy of the entire site on his PC to make and test changes, then upload them to the web server over FTP. There was no version control.

When I took over, I wanted to introduce version control, so I put the entire thing in a git repository hosted on Azure DevOps and used Git LFS for most binary files. The entire repository is now about 10 GB in size (including the LFS objects). There is a deployment pipeline which simply clones the repo and uploads the entire thing via FTP.
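For context, the LFS side is just pattern-based tracking in .gitattributes, along these lines (the patterns below are examples, not the full list):

    # .gitattributes: route large binary file types through Git LFS
    *.mp4 filter=lfs diff=lfs merge=lfs -text
    *.pdf filter=lfs diff=lfs merge=lfs -text
    *.zip filter=lfs diff=lfs merge=lfs -text
    *.bin filter=lfs diff=lfs merge=lfs -text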

Recently, my company introduced an on-premises GitLab installation, and I talked to its admins about migrating the repository there. However, they don't support LFS for now and insist that my workflow is not the way git is meant to be used. Leaving aside the fact that I find their reasoning too dogmatic ("Large binaries aren't supposed to be in git, LFS notwithstanding. If they are, you're doing it wrong."), I don't dispute that my workflow leaves a lot of room for improvement.

They're suggesting I put all of the binary assets in an external storage solution (e.g., SharePoint) and have a deployment job in GitLab pull them when preparing a new version of the web site.
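As I understand the suggestion, the deployment job would look roughly like this. This is only a sketch, since nothing like it exists yet; the two scripts and the variables are placeholders:

    # .gitlab-ci.yml: hypothetical sketch of the suggested workflow
    deploy:
      stage: deploy
      script:
        - ./fetch-assets.sh "$ASSET_STORE_URL"   # pull binaries from external storage (placeholder script)
        - ./deploy-ftp.sh "$FTP_HOST"            # upload code plus assets over FTP, as today (placeholder script)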

This brings me to my actual question. Given these circumstances:

  • Static site, maintained by hand.
  • Binary assets at this time not available from any central source.
  • The assets are updated only very rarely.

Would it be an improvement to follow the GitLab admin's advice? Would you foresee any benefits to me as the site maintainer? If binary assets are no longer part of the repository, is there a way to keep track of asset versions as they relate to repository history?

I'm hoping this question is concrete enough not to be a simple matter of opinion.

Elemi asked 24/2, 2022 at 14:05 Comment(1)
What about S3 buckets? They support static sites as well as versioning. – Imprecation

They're suggesting I put all of the binary assets in an external storage solution (e.g., SharePoint) and have a deployment job in GitLab pull them when preparing a new version of the web site.

Actually, the usual solution is to put them in an artifact repository, which is made to store binaries (e.g., Nexus or Artifactory).

What you then version in git is, for instance, a pom.xml declaring which versions of your static binary assets you need.
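For instance, a minimal pom.xml pinning a zipped asset bundle could look like this (the coordinates are hypothetical; fetching the archive would then be done by, e.g., the maven-dependency-plugin):

    <project xmlns="http://maven.apache.org/POM/4.0.0">
      <modelVersion>4.0.0</modelVersion>
      <groupId>com.example.website</groupId>
      <artifactId>site</artifactId>
      <version>1.0.0</version>
      <packaging>pom</packaging>
      <dependencies>
        <!-- hypothetical coordinates of the asset archive stored in the artifact repository -->
        <dependency>
          <groupId>com.example.website</groupId>
          <artifactId>binary-assets</artifactId>
          <version>3.2.0</version>
          <type>zip</type>
        </dependency>
      </dependencies>
    </project>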

The deployment becomes:

  • a git checkout from the (bare) repository (quick, because there are now fewer files and the repo is much smaller)
  • an unzip of the asset archive from the artifact repository (with the right tree structure); see the sketch below
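Concretely, the deployment script could be as simple as the following sketch (all URLs, names, and versions are placeholders; Artifactory and Nexus both serve generic artifacts over plain HTTP, so curl is enough):

    # clone only the latest revision of the (now small) repository
    git clone --depth 1 https://gitlab.example.com/web/site.git site

    # fetch the asset archive from the artifact repository (hypothetical generic-repo path)
    curl -fSL -u "$ART_USER:$ART_TOKEN" -o assets.zip \
      "https://artifactory.example.com/artifactory/web-assets/binary-assets/3.2.0/assets.zip"

    # unpack into the site tree so the upload step sees the complete structure
    unzip -q assets.zip -d site/

    # then upload the assembled tree over FTP, as before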
Pteridology answered 8/4, 2022 at 7:10 Comment(2)
Thank you for that answer. At my employer we do actually have Artifactory, so that might be an option. I've never used it, though, and don't know anything about it. I've done a little research: aside from repositories specific to particular package-management solutions (e.g., NPM, Docker images, etc.), Artifactory also has "generic" repositories, which I could use for my purposes. Is that what you mean? Would you say I should have each asset as a separate "package" in the repository, or bundle all of them into a single archive? – Elemi
@Elemi Generally, if the assets are tightly coupled (meaning they don't make sense in isolation, but are meant to be deployed as a whole), you could have one giant archive as a generic artifact in Artifactory. I would consider isolating the few artifacts that might still need to change version from time to time: that avoids storing a new version of the giant archive, since you only update those smaller artifacts and leave the large one for immutable resources. – Pteridology
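For reference, publishing a new version of such a generic artifact is a single authenticated upload; a sketch with hypothetical coordinates:

    # upload a new version of the asset archive to a generic Artifactory repository
    curl -fS -u "$ART_USER:$ART_TOKEN" -T assets.zip \
      "https://artifactory.example.com/artifactory/web-assets/binary-assets/3.3.0/assets.zip"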
