Binary-equivalent instructions produce different origination operations and storage size
Values of some types can be represented in multiple formats. For example, values of the address type can be represented as strings (e.g. "tz1XyaS3pWSQHrKxvnJ9Kmpk9PKgWWunYXzU") or bytes (e.g. 0x00008753e875969a309b1466a05735a8698f8349587e). Usually it doesn't matter which format is used in smart contract code or anywhere else. Raw packed data is independent of the format, as this example demonstrates:
tezos-client hash data '{ PUSH address "tz1XyaS3pWSQHrKxvnJ9Kmpk9PKgWWunYXzU"; DROP }' of type 'lambda unit unit'
Raw packed data: 0x0502000000210743036e0a0000001600008753e875969a309b1466a05735a8698f8349587e0320
tezos-client hash data '{ PUSH address 0x00008753e875969a309b1466a05735a8698f8349587e; DROP }' of type 'lambda unit unit'
Raw packed data: 0x0502000000210743036e0a0000001600008753e875969a309b1466a05735a8698f8349587e0320
However, sometimes this difference matters. For example, if two smart contracts differ only in the PUSH address instruction (one uses a string literal and the other uses bytes), their origination operations will have different sizes and different amounts of tez will be burnt. This can be seen in the following example:
- https://you.better-call.dev/carthagenet/KT1Smis81iJPURNe9GqoKfDVk2Z3Pxg18w6k/operations – pushes 0x00008753e875969a309b1466a05735a8698f8349587e, paid storage diff 117.
- https://you.better-call.dev/carthagenet/KT1A1ffB22vRm5TSnUK7er9qLyyAWcKfonSp/operations – pushes "tz1XyaS3pWSQHrKxvnJ9Kmpk9PKgWWunYXzU", paid storage diff 131.
I wonder whether this is intended behavior. It complicates the development of software around Tezos. My example with the Better Call Dev block explorer demonstrates this: if you open the "code" tab, you will see that the code doesn't differ at all. So if you don't know the details provided here, you will see two contracts with the same code and initial storage but different storage sizes. As a user, I would be very confused by that. As a developer, I have to distinguish addresses in the "tz1XyaS3pWSQHrKxvnJ9Kmpk9PKgWWunYXzU" and 0x00008753e875969a309b1466a05735a8698f8349587e formats, and I can no longer rely on unpack (pack x) returning the same x, because addresses in both formats are packed to the same bytestring.
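The gap between the two paid storage diffs reported above (131 vs 117) is 14 bytes, which is consistent with the literal being stored as written. A rough sketch of the arithmetic, assuming the stored script uses the same binary layout visible in the packed output (a 1-byte tag and a 4-byte length in front of the payload) and that a string literal carries the same 5-byte header as a bytes literal; `literal_size` is a hypothetical helper, not part of any Tezos tool:

```python
# Assumption: a string or bytes literal is serialized as
# 1-byte tag + 4-byte big-endian length + payload.
def literal_size(payload_len: int) -> int:
    return 1 + 4 + payload_len

addr_string = "tz1XyaS3pWSQHrKxvnJ9Kmpk9PKgWWunYXzU"
addr_bytes = bytes.fromhex("00008753e875969a309b1466a05735a8698f8349587e")

string_size = literal_size(len(addr_string.encode("ascii")))  # 36-character payload
bytes_size = literal_size(len(addr_bytes))                    # 22-byte payload

print(string_size, bytes_size, string_size - bytes_size)  # 41 27 14
```

The 14-byte difference matches 131 - 117, so the extra burn appears to come entirely from the longer string payload.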
Apparently the same applies to some other types, e.g. timestamp: I ran a similar test with PUSH timestamp "1970-01-01T00:00:00Z"; and PUSH timestamp 0;.
I think it would be much better if the most efficient representation always appeared in the blockchain. So if I have PUSH address "tz1XyaS3pWSQHrKxvnJ9Kmpk9PKgWWunYXzU"; in my smart contract, it would be treated as PUSH address 0x00008753e875969a309b1466a05735a8698f8349587e; under the hood (because that's a more efficient representation).
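The normalization proposed here could look roughly like the sketch below. It is only an illustration under stated assumptions, not how the protocol or tezos-client does it: `b58check_decode` and `to_optimized` are hypothetical names, only tz1 addresses are handled, and the tz1 Base58Check prefix (bytes 6, 161, 159) and the two leading zero bytes of the optimized form (implicit account, ed25519 key hash) are hard-coded.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
TZ1_PREFIX = bytes([6, 161, 159])  # Base58Check prefix that yields "tz1"

def b58check_decode(s: str) -> bytes:
    """Decode a Base58Check string and return the payload without its checksum."""
    n = 0
    for ch in s:
        n = n * 58 + B58_ALPHABET.index(ch)
    raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
    # Each leading '1' character stands for one leading zero byte.
    raw = b"\x00" * (len(s) - len(s.lstrip("1"))) + raw
    payload, checksum = raw[:-4], raw[-4:]
    expected = hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]
    if checksum != expected:
        raise ValueError("bad Base58Check checksum")
    return payload

def to_optimized(address) -> bytes:
    """Return the 22-byte form of a tz1 address; bytes input is passed through."""
    if isinstance(address, bytes):
        return address
    payload = b58check_decode(address)
    if not payload.startswith(TZ1_PREFIX) or len(payload) != len(TZ1_PREFIX) + 20:
        raise ValueError("only tz1 addresses are handled in this sketch")
    # 0x00 = implicit account, 0x00 = ed25519 key hash, then the 20-byte hash.
    return b"\x00\x00" + payload[len(TZ1_PREFIX):]

print("0x" + to_optimized("tz1XyaS3pWSQHrKxvnJ9Kmpk9PKgWWunYXzU").hex())
```

With such a step applied before origination, both contracts in the example would push the same 22-byte literal and should end up with the same storage size.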