CKKS Key switching - Shared Insights

# CKKS Key switching ## Motivation (First try) **First idea,** define $\text{swk}_{s_{in}\rightarrow s_{out}}:=(a^* s_{out}+s_{in}+e^*, a^*) \bmod Q$. This is an encryption of the plaintext $s_{in}$ w.r.t secret key $s_{out}$ modulo Q. Let $ct$ be a [[R-LWE ciphertext (RLWE)]] $ct = (ct_0,ct_1)=(a\cdot s_{in}+m+e,a) \bmod Q$ a ciphertext of the plaintext $m$ w.r.t secret key $s_{in}$ The main idea is to subtract the term $a\cdot s_{in}$ from $a\cdot s_{in}+m+e$ by computing $ct_0 \\ -a\cdot \text{swk}_0$ : $\begin{array}{l} a\cdot s_{in}+m+e \\ -a( a^*s_{out}+s_{in}+e^*) \\ \hline \rule{4cm}{0.4pt} \\ -aa^*s_{out} + m+e-ae^* \end{array}$ defining a new ciphertext $(ct_0-a\cdot swk_0,-aa^*)\bmod Q$ which "encrypts" m w.r.t secret key $s_{out}$. note that as $a$ and $a^*$ are uniformly random than so is their multiplication. #### But this will NOT work! There are two problems: - The switching key is not valid as the noise $e^*$ and the plaintext $s$ have both small norm and so $e*$ will "hide" the plaintext $s$. - The output $ct^\prime$ is NOT a ciphertext of $m$ w.r.t secret key $s$. The problem here is that the error term $a e^*$ does not have a small norm and so $m +e^\prime$ may wrap around the modulo $Q$ i.e., this ciphertext is invalid and may not be decryptable. **Second idea**, choose $P$ s.t. $|P|\approx |Q|$ and divide the noise $ae^*$ by $P$ but not divide the plaintext $m$ define $\text{swk}_{s_{in}\rightarrow s_{out}}:=(a^* s_{out}+P\cdot s_{in}+e^*, a^*) \bmod PQ$ . This corrects the first issue as now the message is $P\cdot s_{in}$ We raise $a \mod Q$ to $a \bmod QP$ (the coefficients of the latter have the same magnitude as the former - raising the modulo does not change the polynomial), compute $a\cdot \text{swk}_{s_{in}\rightarrow s_{out}}$ , divide by $P^{-1}$ and drop the modulo back to $Q$: $P^{-1}\cdot a\cdot \text{swk}_0=P^{-1}aa^*s_{out}+as_{in}+P^{-1}ae^* \bmod Q$ note that the rightmost term has the magnitude of $e^*$ as $|a|<Q$ and $|P|\approx|Q|$. This solves the second issue. define $ct_{s_{in}\rightarrow s_{out}}:=ct_0 - \lfloor P^{-1}\cdot a\cdot \text{swk}_0\rceil\bmod Q$ : The bx part of $ct_{s_{in}\rightarrow s_{out}}$ is: $\begin{array}{l} a\cdot s_{in}+m+e \bmod Q\\ - \lfloor a\cdot P^{-1} ( a^*s_{out}+P\cdot s_{in}+e^*)\rceil\bmod Q\\ \hline \rule{6cm}{0.4pt} \\ -P^{-1}aa^*s_{out} + m+ \underbrace{e-P^{-1}ae^* +e_{\text{round}}}_{\text{small error}}\bmod Q \end{array}$ The "a" (random) part of $ct_{s_{in}\rightarrow s_{out}}$ is $-P^{-1}aa^*$ This ciphertext encrypts the original message $m$ w.r.t secret key $s_{out}$ > Notes: > 1) division by $P$ of an element in ring $PQ$ result in an element with the same value divided by P (with possible rounding) in ring $Q$ e.g., $x=[x]_{15}+15y$ s.t. $[x]_{15}<15$ so $x/5=[x]_{15}/5+3y$ s.t. $[x]_{15}/5<3$ which means that $[x/5]_{3}= [x]_{15}/5$ . > 2) Computing the modulo reduction to $Q_\ell$ requires converting the polynomial to coefficient representation, reduce the modulo of the coefficients and back to NTT > 3) This idea caused a new problem - the encryption of switch key is computed modulo $PQ$ and $|P|\approx|Q|$ and the modulo size is limited for security constraints. This means we have half the levels for computing homomorphic operations. **Third idea**. The problem with the 2nd idea is the half of the levels are taken by special primes and as security limits the number of levels then this reduce the number of effective levels for homomorphic computation. To solve this, the new idea is to reduce the size of $P$ by reducing the norm of $a$ (the uniform random element of a ciphertext $ct$). smaller $a$ means we need smaller $P$ that divides it and make its norm small when reducing the modulo to Q from modulo QP in the computation above. 1. Define switching keys for all $i$ by a set of ciphertexts. This is computed during scheme initialization. $\text{swk}^{(i)}_{s_{in}\rightarrow s_{out}}:=(a^*_i\cdot s_{out}+ \hat{q_i}\cdot P\cdot s_{in}+e^*_i, a^*_i) \bmod PQ$ 2. Decompose $a$ by $\big([a\cdot\hat{q_0}^{-1}]_{q_0},\ldots, [a\cdot\hat{q_L}^{-1}]_{q_L}\big)$ Note by the CRT: $a=\sum_{j=0}^L [a]_{q_j}\cdot[\hat{q}_j^{-1}]_{q_j}\cdot\hat{q}_j=\sum_{j=0}^L [a\hat{q}_j^{-1}]_{q_j}\hat{q}_j \bmod Q = \langle ([a\hat{q}_j^{-1}]_{q_j})_j , (\hat{q}_j)_j\rangle\bmod Q$ where $\hat{q_j}:=\prod_{i\ne j}{q_i}$ and all components have small norm i.e., $\big|[a\cdot\hat{q_i}^{-1}]_{q_i}\big|< q_i$ for $i=0,\ldots,L$ 3. Set $P$ to be a prime of size 64 bits i.e., $P>q_i$ for all $i$ 4. mod-up of the decomposition $i$ of $a$: $[a\cdot\hat{q_i}^{-1}]_{q_i}$ to be modulo $P\cdot Q$ 5. multiply the i-th switch key and the $i$-th decomposition of $a$, we have $\begin{align} \big[[a\cdot\hat{q_i}^{-1}]_{q_i}\big]_{PQ}\cdot \text{swk}^{(i)}_{s_{in}\rightarrow s_{out}} &= [a\cdot\hat{q_i}^{-1}]_{q_i}\cdot \text{swk}^{(i)}_{s_{in}\rightarrow s_{out}} \bmod PQ \\ &= \big([a\cdot\hat{q_i}^{-1}]_{q_i}\cdot a^*_i\cdot s_{out} + [a\cdot \hat{q_i}^{-1}]_{q_i}\cdot \hat{q_i}\cdot P\cdot s_{in}+ e^*_i[a\cdot\hat{q_i}^{-1}]_{q_i} , [a\cdot\hat{q_i}^{-1}]_{q_i} a^*_i\big)\bmod PQ \end{align}$ 7. mod_down to modulo $P$: $\bigg(P^{-1}[a\cdot\hat{q_i}^{-1}]_{q_i}\cdot a^*_i\cdot s_{out} + [a\cdot \hat{q_i}^{-1}]_{q_i}\cdot \hat{q_i}\cdot s_{in}+ P^{-1}\cdot e^*_i[a\cdot\hat{q_i}^{-1}]_{q_i} \ ,P^{-1}\ [a\cdot\hat{q_i}^{-1}]_{q_i} a^*_i\bigg)\bmod Q$ 6. sum all results: $e_i^\dagger:=P^{-1}\cdot e^*_i[a\cdot\hat{q_i}^{-1}]_{q_i}$, $a_i^\dagger:=P^{-1}[a\cdot\hat{q_i}^{-1}]_{q_i}\cdot a^*_i$ > $t = \sum_{i=0}^L \big([a\cdot\hat{q_i}^{-1}]_{q_i}\cdot \text{swk}^{(i)}_{s_{in}\rightarrow s_{out}}\big)= \big(\sum_{i=0}^L a^\dagger_i\cdot s_{out}+ a\cdot s_{in}+ \sum_{i=0}^L e_i^\dagger,\sum_{i=0}^L a^\dagger_i\big)\bmod Q$ 7. subtract from original ciphertext: $ct_0- t=as_{in}+m+e-t=a^{\dagger\dagger}s_{out}+m+e+e^{\dagger\dagger}$ where $a^{\dagger\dagger}:= \sum_{i=0}^L a^{\dagger}_i$ and $e^{\dagger\dagger}:=\sum_{i=0}^L e_i^\dagger$ **Fourth idea**. The problem with the 3rd idea that you need more memory to store the extra switch keys and more computation is being done for each key switch. To balance this, the new idea is to use batches of $q_is, $Q_j:=\prod_{j\alpha}^{j(\alpha+1)} q_i$ for decomposing $a$ to fewer smaller polynomials. 1. Decompose $a$ by $decomp(a):=([a]_{Q_0},\ldots,[a]_{Q_k})$ Note that we can construct $a\bmod Q$ by the [[Chinese Reminder Theorem (CRT)]] formula: $a=\sum_{j=0}^k [a]_{Q_j}\cdot[\hat{Q}_j^{-1}]_{Q_j}\cdot\hat{Q}_j \bmod Q$ , where $\hat{Q_j}:=\prod_{i\ne j}{Q_i}$ and all components have small norm. 2. Set $P=\prod_{i=0}^\alpha P_i$ s.t $P_i$ are primes and $|q_i|<|P_i|<\text{64 bits}$ for all $i$. 3. Define switching keys by a set of ciphertexts $\text{swk}^{(i)}_{s_{in}\rightarrow s_{out}}:=(a^*_i\cdot s_{out}+ [\hat{Q_i}^{-1}]_{Q_i}\cdot\hat{Q_i} \cdot P\cdot s_{in}+e^*_i, a^*_i) \bmod PQ$ 2. mod-up $[a]_{Q_i}$ to be modulo $P\cdot Q$ i.e., add CRT frames (no change in polynomial only its representation in CRT) and multiply, we have $ \begin{align} \big[[a]_{Q_i}\big]_{PQ}\cdot \text{swk}^{(i)}_{s_{in}\rightarrow s_{out}} &= [a]_{Q_i}\cdot \text{swk}^{(i)}_{s_{in}\rightarrow s_{out}} \bmod PQ \\ &= \big([a]_{Q_i}\cdot a^*_i\cdot s_{out} + [a]_{Q_i}\cdot [\hat{Q_i}^{-1}]_{Q_i}\cdot \hat{Q_i}\cdot P\cdot s_{in}+ [a]_{Q_i}\cdot e^*_i ,\ [a]_{Q_i}\cdot a^*_i\big)\bmod PQ \end{align} $ 3. mod_down to modulo Q 4. sum all results: $\begin{align} t &= \langle decomp(a), \text{swk}_{s_{in}\rightarrow s_{out}} \rangle \\ &= \sum_{i=0}^L \big([a]_{Q_i}\cdot \text{swk}^{(i)}_{s_{in}\rightarrow s_{out}}\big) \\ &= \bigg(\sum_{i=0}^L a^\dagger_i\cdot s_{out}+ a\cdot s_{in}+ \sum_{i=0}^L e_i^\dagger, \ \sum_{i=0}^L a^\dagger_i\bigg)\\ &=\bigg(a^{\dagger\dagger}\cdot s_{out}+a\cdot s_{in}+e^{\dagger\dagger}, \ \ a^{\dagger\dagger}\bigg) \end{align}$ where : $e_i^\dagger:=P^{-1}\cdot e^*_i[a\cdot\hat{Q_i}^{-1}]_{Q_i}$, $a_i^\dagger:=P^{-1}[a\cdot\hat{Q_i}^{-1}]_{Q_i}\cdot a^*_i$ , $\text{swk}:=\{\text{swk}^{(i)}_{s_{in}\rightarrow s_{out}}\}_{i=0}^L$ 5. subtract from original ciphertext: $ct_0- t=(as_{in}+m+e,\ a)-t=(a^{\dagger\dagger}s_{out}+m+e+e^{\dagger\dagger},\ \ a^{\dagger\dagger})$ where $a^{\dagger\dagger}:= \sum_{i=0}^L a^{\dagger}_i$ , $e^{\dagger\dagger}:=\sum_{i=0}^L e_i^{\dagger}$ ## Equivalent computation with inner product[^2] Given a ciphertext $(b, a)$ and $swk^j=\big([\text{swk}^{(i)}_{s_{in}\rightarrow s_{out}}]_j \big)_{i=0}^L$ for $j=0,1$ 1. Let $d$ be the decomposition of $a$ after modulo-up to $PQ$ 2. $t_1 = \langle d, swk^0 \rangle$ 3. $t_2=\langle d, swk^1\rangle$ 4. $t = \lfloor P^{-1}\cdot t_1 \rceil$ (mod down) 5. $a^{\dagger\dagger}=\lfloor P^{-1}\cdot t_2 \rceil$ (mod down) Notes - this method require that we compute $L/\alpha$ key switch algorithms compared to $L$ computations for the method using single $P$. - $a$ is required to be in CRT representation before calling mod-up in every method described above. - Using multiple "small" switch keys requires more memory: $(L+1) \Delta n\frac{L}{2}$ vs. $2L\Delta n$ of single key i.e., it is quadratic in $L$ the effective multiplication depth of the scheme, $n$ ring dimension and $\Delta$ the scale. But it enables more levels for homomorphic computation for secured setups. - $a$ is the random element of a given ciphertext and so it's decomposition is needed to be compute right before key switch and can't be precomputed during initialization of the scheme. ## Gadget decomposition description The above technique can be seen in more general terms: we would like to compute $a\cdot u\in R_Q$ for $a, u\in R_Q$ and $u$ is fixed. we transform $a$ to $(b_i)_{0\le i< d}$ s.t. $a=\sum_i b_i\cdot g_i\bmod Q$ and $g_i$ are fixed gadget basis. $u_i\in R_Q$ are chosen such that $a\cdot u =\sum_i b_i\cdot u_i\bmod Q$ for any $a$. This transformation is called *gadget decomposition*. In [[Gadget decomposition for key switch]] method another decomposition is computed for the key switch material $u$ that reduce computation complexity. Related: - [[History of key switch methods]] - [[Gadget Decomposition and External Product]] - Generate switch keys homomorphically: [[Hierarchical rotation key]],[[Generate rotation keys homomorphically by scheme conversion (RLWE<->LWE)]] ## Created 2022-03-01 09:30 [^2]: https://eprint.iacr.org/2020/1203.pdf#page=12.72, page 12