Why point compression shouldn't scare you

Summary and motivation

The most widely used method to represent ECC public keys represents the key by directly encoding the coordinates of the public key point as-is. However, there’s an alternative method that not only uses less data, but also comes with a number of security benefits (almost) for free.

The “compressed point” convention was encumbered by (dubious) patents for the better part of the past two decades, which has unfortunately severely limited widespread adoption of these ideas. Having said that, the last of these patents expired several years ago, so now we’re definitely free to use point compression as we please.

Below, I’ll attempt to argue why you should be using point compression whenever you can. Spoiler alert: it has very little to do with bandwidth optimisation or performance gains.

Background

An ECC public/private key pair consists of the following data:

  • a previously agreed-upon elliptic curve EE over some (finite) field 𝕂\mathbb{K};
  • a base point GEG\in E that generates a large subgroup of E(𝕂)E(\mathbb{K}) of prime order;
  • a public point (often denoted by QQ);
  • a private number (often denoted by dd).

The public and private parts are related by Q=dGQ=dG. As such, the security of ECC hinges on the difficulty of solving the discrete logarithm problem in the subgroup generated by GG.

For the purposes of this post, we’ll assume that 𝕂=𝔽p\mathbb{K}=\mathbb{F}_p for some prime p>3p>3. The techniques we discuss apply to more general fields as well, but that comes at the cost of some additional complexity. Besides, prime fields cover the vast majority of ECC applications in common use. If you want to know how the scheme works in characteristic 2, have a look at SEC 1, secs. 2.3.3, 2.3.4. Most of the time, we’ll also assume that GG generates all of E(𝕂)E(\mathbb{K}) (or, in other words, that the cofactor is one). I’ll point out when that is relevant. Throughout, we identify 𝔽p\mathbb{F}_p with the ring of residue classes /p\mathbb{Z}/p\mathbb{Z}.

Since elliptic curves live in the plane 𝔽p2\mathbb{F}_p^2, the public point QQ is given by two coordinates (x,y)(x,y), each of which is a number in 𝔽p\mathbb{F}_p. The most straightforward (and still most common) way to represent the public key QQ is by serialising both of these. However, this representation actually has a lot of redundancy. We’ll explore why that is the case in the next section.

Point compression over prime fields in a nutshell

The encoding procedure

Let’s assume that our elliptic curve EE is given by the equation y2=x3+ax+by^2 = x^3 + ax + b. Here, aa and bb are elements in 𝔽p\mathbb{F}_p. Note that such a rewriting is always possible if pp is a prime greater than 33.

Suppose now that Q=(x1,y1)Q=(x_1, y_1) is a point on the curve EE that is not the point at infinity. Then x1x_1 and y1y_1 are related as follows: y12=x13+ax1+by_1^2 = x_1^3 + a x_1 + b. In other words, y1y_1 must be a square root of x13+ax1+bx_1^3 + a x_1 + b in 𝔽p\mathbb{F}_p. Recall that square roots in fields are unique up to sign 1. In particular, it follows that to specify QQ, we only need x1x_1 and the parity2 of y1y_1.

In practice curve points are encoded in compressed form as follows.

  1. Set LL to the bit length of pp, and allocate an output buffer of length L+1L+1.
  2. Let Q=(x1,y1)Q=(x_1,y_1) be a point on EE that is not the point at infinity. We identify x1x_1 and y1y_1 with their integer representatives in [0,p1][0, p-1].
  3. If y1y_1 is even, set the first byte of the result to 0x02. If it is odd, set it to 0x03.
  4. Encode x1x_1 into LL bytes in big-endian form, and fill the rest of the output buffer with the resulting bytes.

That’s all very straightforward. The decompression procedure is slightly more interesting.

Decoding compressed points

As we noted in the previous paragraph, recovering the yy-coordinate of a curve point given the xx-coordinate and parity bit amounts to taking a square root in 𝔽p\mathbb{F}_p. However, in real-world data, it may happen that the supplied coordinate x1x_1 is such that x13+x1a+bx_1^3+x_1a+b is not a quadratic residue in 𝔽p\mathbb{F}_p! That means that x1x_1 cannot possibly be the xx-coordinate of any curve point, and hence the input must be rejected.

After taking that into account, the decoding procedure is not hard to describe.

  1. Verify that the input starts with 0x02 or 0x03. If not, reject the input 3.
  2. Introduce an auxiliary variable ss, the value of which is 00 if the first byte of the input is 0x02, and 11 otherwise.
  3. Decode the remaining LL bytes of the input into an integer xx, using big-endian encoding. If x[0,p1]x\notin [0, p-1], reject the input.
  4. Attempt to compute a square root of x3+xa+bx^3+xa+b modulo pp, and denote the result by yy'. If no square root is found, reject the input.
  5. If s=y(mod2)s=y'\pmod{2}, then output (x,y)(x,y'). Otherwise output (x,py)(x,p-y').

Observe that, by construction, (x,y)(x,y') and (x,py)(x,p-y') are always valid curve points.

Security implications

The decoding procedure already illustrates one of the fundamental advantages of using compressed points: provided that the modular arithmetic is implemented correctly, it’s impossible to accidentally admit points that don’t lie on the curve!

Many real-world implementations of ECC-based algorithms don’t bother to verify whether the coordinates they ingest actually correspond to valid curve points. This is very problematic, and a source of dangerous vulnerabilities; see e.g. BMM00. Using point compression gets around this issue by forcing the implementation to compute one of the coordinates using the curve parameters, ensuring that the output is always part of the relevant curve by construction.

Of course, in order for a point to be a valid public key, it must additionally lie in the subgroup of E(𝔽p)E(\mathbb{F}_p) generated by the chosen base point GG. As it happens, this is one of the reasons why many ECC standards choose their curve parameters such that GG generates the entire group: it absolves implementations of the duty to validate whether or not a point lands in the correct subgroup of E(𝔽p)E(\mathbb{F}_p) 4. The security impact of this kind of checks can be subtle; see e.g. BCJZ20, secs. 4.1.3, 5.4 for an analysis specific to Ed25519 5.

Wrapping up

Let’s summarise the main points.

  • Any potentially relevant patents on point compression expired years ago.
  • Point compression and decompression is not hard to implement.
  • Using point compression gets you some extra security, basically for free.

So, if you’re in a position where you don’t have to worry about legacy systems that don’t understand compressed points: please use them!

While RFC 5480 allows both compressed and uncompressed points in certificate public key info data, support for compressed points isn’t a requirement of the specification—presumably because of perceived patent issues. As such, if you’re dealing with legacy clients or public workflows, you might not be able to get away with using compressed points everywhere…

Bibliography

BCJZ20
J. BRENDEL, C. CREMERS, D. JACKSON and M. ZHAO. The Provable Security of Ed25519: Theory and Practice [online]. 2020. Available from: https://eprint.iacr.org/2020/823
BMM00
I. BIEHL, B. MEYER and V. MÜLLER. Differential Fault Attacks on Elliptic Curve Cryptosystems. In : Advances in Cryptology - CRYPTO 2000, 20th Annual International Cryptology Conference, Santa Barbara, California, USA. 2000. DOI 10.1007/3-540-44598-6_8.
RFC 5480
INTERNET ENGINEERING TASK FORCE (IETF). RFC 5480: Elliptic Curve Cryptography Subject Public Key Information [online]. 2009. Available from: https://tools.ietf.org/html/rfc5480.html
SEC 1
D. BROWN. SEC 1: Elliptic Curve Cryptography [online]. 2009. Available from: https://www.secg.org/sec1-v2.pdf

  1. In case you don’t remember why this is the case, here’s a quick proof. Let 𝕂\mathbb{K} be a field, a𝕂a\in\mathbb{K} an arbitrary element and suppose that a2=a2a^2=a'^2. Then 0=a2a2=(aa)(a+a)0=a^2-a'^2=(a-a')(a+a'). Since fields do not have zero divisors, it follows that either a=aa=a' or a=aa=-a'.↩︎

  2. There’s some abuse of language here: what we’re really asking for is the parity of the unique integer representative zz in the residue class of y1y_1 such that 0z<p0\leq z<p. This makes sense, because z=pz(modp)-z = p - z \pmod{p}, so pzp-z is the unique such representative of y1-y_1, and has opposite parity.↩︎

  3. If you’re also interested in parsing uncompressed points (e.g. for compatibility with systems that don’t use point compression), then you should check for the marker 0x04; see SEC 1, sec. 2.3.4.↩︎

  4. For some algorithms that work with curve parameters where the cofactor is nontrivial, there exist more sophisticated point compression procedures that are also capable of guaranteeing that the output still lands in the correct subgroup. Ristretto255 falls into this category.↩︎

  5. Strictly speaking, the point compression scheme used in (standard) Ed25519 implementations is slightly different from the one discussed here, but the broader point stands.↩︎