Sunday, June 26, 2022

Html: Unwanted cached resources

So as you write a simple a html file with one or two resources, lets say one css and one js file, and you then deploy it (= copy paste) to a sever, and then you make some changes (= fix a bug), and then you redeploy (= copy paste, overwrite popup, press yes), and then nothing changes.

The browser, like always, remembers the old 'main.js' from before and then you pres Ctrl+F5. But your tester in the other side, has no clue what a Ctrl is. So you have to do something.

The first thing is to add a simple '?1' at the end of 'main.js' and every other resource, the browser thinks it may be a different file and deletes the cached version.

So now lets make a script to automate the replacement of the '?1' with '?2':

sed 's/"\(.*\)?.*"/"\1?2"||/' -i index.html

But we may want the browser to actually keep some cached files, the files that have not changed. Only if, instead of '?2', we could generate a number that only changes if the actual file changes and also does not repeat it self.

Hash functions? Checksums? Yes.

Lets use 'md5sum' to freak out the people who don't know. A single sed will not make it, lets write a sed to generate multiple seds and pipe it into a sh.

First we find checksums:

find -regex '.*\(js\|css\)$' -exec md5sum {} \+

Then we pipe that to our first sed:

sed 's/\(.*\)  \(.*\)/ ... \2 ... \1 ... /'

where in the place '...' we reconstruct version of our original sed (gets very unreadable). And finally we pipe it to a sh to actually do the substitution.

The whole thing if placed in Makefile would look like this:

index.html: *.css *.js
    find *.css *.js -exec md5sum {} \+ | \
        sed 's/\(.*\)  \(.*\)/sed \"s|\\"\2?[^\\"]*|\\"\2?\1|\" -i index.html/' | sh