Setup
Most cloud providers allow you to set commands to run when a host comes online. The general format for this is called cloud-init or user-data. There’s full documentation for cloud-init here.
In my particular case, I had extra files I wanted to write to the host using Pulumi. To make it all fit, I was gzip-compressing the files and then base64-encoding the result.
Example code
Example code to do this in Python is:
import gzip
import base64

def gzip_string(input_string):
    compressed = gzip.compress(input_string.encode())
    compressed_b64 = base64.b64encode(compressed)
    return compressed_b64.decode()

original_string = """
# cloud-config
"""
instance_userdata = gzip_string(original_string)
print(instance_userdata)
Instance metadata may look something like this:
H4sIABu242YC/+NSVkjOyS9N0U3Oz0vLTOcCAIev2ZQQAAAA
The problem
Everything worked, but every time I ran pulumi up I saw state drift in the instance metadata. Because the value was gzip-compressed and then base64-encoded, it was difficult to see why.
However, going back to the sample code above: if you wait just one second and run the code again, you’ll get a different output.
H4sIAFK242YC/+NSVkjOyS9N0U3Oz0vLTOcCAIev2ZQQAAAA
It’s subtle, but H4sIABu became H4sIAFK.
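To pin down exactly where the two outputs diverge, it helps to compare the decoded bytes rather than the base64 text. Here is a minimal sketch using the two sample strings above. The helper name differing_offsets is my own, purely for illustration, and it assumes both blobs have the same length.

```python
import base64

first = "H4sIABu242YC/+NSVkjOyS9N0U3Oz0vLTOcCAIev2ZQQAAAA"
second = "H4sIAFK242YC/+NSVkjOyS9N0U3Oz0vLTOcCAIev2ZQQAAAA"


def differing_offsets(a_b64, b_b64):
    """Return the byte offsets at which two base64-encoded blobs differ."""
    a = base64.b64decode(a_b64)
    b = base64.b64decode(b_b64)
    # zip stops at the shorter input, so this assumes equal lengths.
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


print(differing_offsets(first, second))  # prints [4]
```

Only one byte differs, at offset 4. That is near the very start of the stream, well before the compressed payload, so the file contents themselves are unchanged.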
This is because the gzip header has a field (MTIME) that records a timestamp, which by default is the time the data was compressed. From the RFC (RFC 1952):
+---+---+---+---+---+---+---+---+---+---+
|ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
+---+---+---+---+---+---+---+---+---+---+
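The fixed part of that header is ten bytes, and MTIME is stored as a little-endian 32-bit Unix timestamp. As a rough sketch, Python's struct module can pull the field out of the two sample outputs from earlier. The function name gzip_mtime is hypothetical and not part of any library.

```python
import base64
import struct


def gzip_mtime(b64_blob):
    """Extract the MTIME field from a base64-encoded gzip stream."""
    raw = base64.b64decode(b64_blob)
    # "<" means little-endian with no padding:
    # ID1, ID2, CM, FLG (4 x uint8), MTIME (uint32), XFL, OS (2 x uint8)
    id1, id2, cm, flg, mtime, xfl, os_id = struct.unpack("<4BI2B", raw[:10])
    if (id1, id2) != (0x1F, 0x8B):
        raise ValueError("not a gzip stream")
    return mtime


first = gzip_mtime("H4sIABu242YC/+NSVkjOyS9N0U3Oz0vLTOcCAIev2ZQQAAAA")
second = gzip_mtime("H4sIAFK242YC/+NSVkjOyS9N0U3Oz0vLTOcCAIev2ZQQAAAA")
print(first, second, second - first)  # prints 1726199323 1726199378 55
```

The two timestamps are 55 seconds apart, and nothing else in the output differs.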
Python’s gzip.compress defaults mtime to the current time, but it has an mtime parameter that you can set to zero to avoid this.
def gzip_string(input_string):
    compressed = gzip.compress(input_string.encode(), mtime=0)
    compressed_b64 = base64.b64encode(compressed)
    return compressed_b64.decode()
Changing my Pulumi code to pass mtime=0 removed the state drift.
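A quick way to convince yourself the fix holds is to check that the output depends only on the input and on mtime, not on when the code runs. The sketch below is one way to do that. Note that this gzip_string takes an extra mtime argument that the version above does not, purely so both behaviours can be compared side by side.

```python
import base64
import gzip


def gzip_string(input_string, mtime=0):
    # mtime is exposed here only for the comparison below (an assumption of
    # this sketch); the real helper simply hard-codes mtime=0.
    compressed = gzip.compress(input_string.encode(), mtime=mtime)
    return base64.b64encode(compressed).decode()


config = "\n# cloud-config\n"

# Same input and same mtime: identical output, no matter when it runs.
assert gzip_string(config) == gzip_string(config)

# Different timestamps: different output, which is what caused the drift.
assert gzip_string(config, mtime=1) != gzip_string(config, mtime=2)

# The MTIME field (bytes 4..7 of the header) is all zeros when mtime=0.
raw = base64.b64decode(gzip_string(config))
assert raw[4:8] == b"\x00\x00\x00\x00"

# The payload still round-trips to the original text.
assert gzip.decompress(raw).decode() == config
```

The mtime keyword on gzip.compress needs Python 3.8 or newer.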
Other IaC tools
Terraform has a base64gzip function that does this for you (code here):
var b bytes.Buffer
gz := gzip.NewWriter(&b)
if _, err := gz.Write([]byte(s)); err != nil {
    return ...
}
if err := gz.Flush(); err != nil {
    return ...
}
if err := gz.Close(); err != nil {
    return ...
}
return cty.StringVal(base64.StdEncoding.EncodeToString(b.Bytes())), nil
Reasonably, gzip.NewWriter defaults to mtime=0: Go zero-initializes struct fields, so the header's ModTime field is left at its zero value unless you set it. As a result, Terraform users don’t encounter this when using base64gzip.