This commit is contained in:
jude 2023-04-16 21:28:29 +01:00
parent ad26788927
commit 1bd15f36e7
4 changed files with 49 additions and 31 deletions

View File

@@ -21,7 +21,7 @@ function cryptoShuffle(l) {
return out;
}
-const ROUNDS = 24;
+const ROUNDS = 48;
function getCoins(text) {
// Construct verifier coins
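The new round count doubles the work a cheating prover must survive. For context, a minimal sketch of how verifier coins can be derived in this Fiat-Shamir style, assuming jsSHA's SHAKE256 interface (cited later in the report); this is illustrative, not the repository's exact getCoins:

import jsSHA from "jssha";

const ROUNDS = 48;

// Derive ROUNDS single-bit coins from the session transcript. Hashing keeps
// the coins unpredictable before commitment but reproducible by any verifier.
function getCoins(text) {
    const shake = new jsSHA("SHAKE256", "TEXT");
    shake.update(text);
    // One output byte per coin; keep only the low bit of each byte.
    const bytes = shake.getHash("UINT8ARRAY", { outputLen: 8 * ROUNDS });
    return Array.from(bytes, (b) => b & 1);
}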

View File

@@ -389,3 +389,11 @@ doi={10.1109/SP.2014.36}}
location = {Seoul, South Korea},
series = {SenSys '15}
}
+@article{ecma2024262,
+  title={{ECMAScript} 2024 Language Specification},
+  author={ECMA},
+  journal={ECMA (European Association for Standardizing Information and Communication Systems)},
+  url={https://tc39.es/ecma262},
+  year={2024}
+}

Binary file not shown.

View File

@@ -45,6 +45,14 @@
\maketitle
+\chapter*{}
+\begin{center}
+With thanks to Dr. Jim Laird and Dr. Guy McCusker.
+\end{center}
+\clearpage
\section{Outline}
Risk is a strategy game developed by Albert Lamorisse in 1957. It is a highly competitive game, in which players battle for control over regions of a world map by stationing units within their territories in order to launch attacks on neighbouring territories that are not in their control.
@@ -163,7 +171,7 @@ The main game protocol can be considered as the following graph mutations for a
The goal is then to identify ways to secure this protocol by obscuring the edges and weights, whilst preventing the ability for the player to cheat.
-\subsubsection{Graphs \& ZKPs}
+\subsubsection{Graphs \& zero-knowledge proofs}
\cite{10.1145/116825.116852} identifies methods to construct zero-knowledge proofs for two graphs being isomorphic or non-isomorphic.
@@ -185,8 +193,6 @@ These challenges restrict the ability for the prover to cheat: if the two nodes
Offering a choice between exactly two challenges is ideal, as the probability of cheating going undetected in a single round is $\frac{1}{2}$. Using more challenge options (e.g., $n$) can reduce the probability of detecting a cheat in a single round to $\frac{1}{n}$, as a prover may be able to prepare responses for all but one challenge; this would then require far more rounds of communication to convince the verifier to the same level of certainty.
-Adjacency proofs are necessary to ensure that players move units fairly.
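For reference, the standard soundness calculation behind these round counts: with two challenges, each round independently catches an unprepared prover with probability $\frac{1}{2}$, so
\[
    \Pr[\text{cheating survives } t \text{ rounds}] = 2^{-t}, \qquad 2^{-48} \approx 3.6 \times 10^{-15}
\]
for the $t = 48$ adopted elsewhere in this commit.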
\subsubsection{Cheating with negative values}
Zerocash is a ledger system that uses zero-knowledge proofs to ensure consistency and prevent cheating. Ledgers are the main existing use case of zero-knowledge proofs, and there are some limited similarities between ledgers and Risk in how they wish to obscure values of tokens within the system.
@@ -205,7 +211,7 @@ A similar issue appears in the proposed system: a cheating player could update t
Some cryptosystems admit an additive homomorphic property: that is, given the public key and two encrypted values $\sigma_1 = E(m_1), \sigma_2 = E(m_2)$, the combination $\sigma_1 + \sigma_2 = E(m_1 + m_2)$ is a ciphertext of the sum of the underlying plaintexts.
-\cite{paillier1999public} defined a cryptosystem based on residuosity classes, which expresses this property. \cite{damgard2003} demonstrates an honest-verifier zero-knowledge proof for proving a given value is 0. Hence, clearly, proving a summation $a + b = v$ can be performed by proving $v - a - b = 0$ in an additive homomorphic cryptosystem.
+\cite{paillier1999public} defined a cryptosystem based on composite residuosity classes, which expresses this property. \cite[Section~5.2]{damgard2003} demonstrates an honest-verifier zero-knowledge proof that a given ciphertext encrypts 0. Hence, proving a summation $a + b = v$ can be performed by proving $v - a - b = 0$ in an additive homomorphic cryptosystem.
So, using such a scheme to obscure edge weights should enable verification of the edge values without revealing their actual values.
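For concreteness, in Paillier (public key $(n, g)$, ciphertexts $E(m) = g^m r^n \bmod n^2$) the homomorphic combination is ciphertext multiplication; this is standard and not specific to this implementation:
\[
    E(m_1) \cdot E(m_2) = g^{m_1} r_1^n \cdot g^{m_2} r_2^n = g^{m_1 + m_2} (r_1 r_2)^n \equiv E(m_1 + m_2) \pmod{n^2},
\]
so proving $a + b = v$ reduces to showing that $E(v) \cdot (E(a) \cdot E(b))^{-1}$ is an encryption of $0$.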
@@ -220,7 +226,7 @@ An alternative general protocol is the $\Sigma$-protocol \cite{groth2004honest}.
\end{itemize}
This reduces the number of communications to a constant, even for varying numbers of challenges.
-The Fiat-Shamir heuristic \cite{fiatshamir} provides another method to reduce communication by constructing non-interactive zero-knowledge proofs using a random oracle. For ledgers, non-interactive zero-knowledge proofs are necessary, as the ledger must be resilient to a user going offline. However, in our case, users should be expected to stay online for an entire session of Risk, and each session is self-contained. So this full transformation is not necessary.
+The Fiat-Shamir heuristic \cite{fiatshamir} provides another method to reduce communication by constructing non-interactive zero-knowledge proofs using a random oracle. For ledgers, non-interactive zero-knowledge proofs are necessary, as the ledger must be resilient to a user going offline. Our setting is different, but non-interactive zero-knowledge proofs are still beneficial: the amount of communication is reduced significantly, and it is easier to write code that settles a non-interactive proof than an interactive one.
\subsubsection{Set membership proofs}
@@ -381,11 +387,11 @@ Random values are used in two places. \begin{itemize}
\item Rolling dice.
\end{itemize}
-As this protocol must run many times during a game, we consider each operation of the protocol as a "session", each of which has a unique name that is derived from the context. This has another benefit as the unique name can then be used with the Web Lock API to prevent race conditions that may occur due to this protocol running in a non-blocking manner.
+As this protocol must run many times during a game, we consider each operation of the protocol as a ``session'', each with a unique name derived from the context. This has another benefit: the unique name can be used with the Web Locks API to prevent race conditions that may occur due to this protocol running asynchronously.
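A minimal sketch of that locking pattern, assuming only the standard navigator.locks entry point; the session-name format and function names are invented for illustration:

// Run one step of a named session. Concurrent requests for the same session
// name are queued by the browser rather than interleaved.
async function withSession(sessionName, step) {
    return navigator.locks.request(sessionName, async () => step());
}

// e.g. withSession("diceRoll:player3:turn12", () => contributeNoise());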
\subsection{Proof system}
-The first proof to discuss is that of \cite{damgard2003}. The authors give a method to prove knowledge of an encrypted value. The importance of using a zero-knowledge method for this is that it verifies knowledge to a single party. This party should be an honest verifier: this is an assumption we have made of the context, but in general this is not true, and so this provides an attack surface for colluding parties.
+The first proof to discuss is that of \cite[Section~5.2]{damgard2003}, in which the authors give a protocol to prove that a ciphertext is an encryption of zero. The importance of using a zero-knowledge method for this is that the proof is verified by a single party. This party should be an honest verifier: this is an assumption we have made of the context, but in general it does not hold, and so this provides an attack surface for colluding parties.
The proof system presented is a Schnorr-style interactive proof for a given ciphertext $c$ being an encryption of zero.
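The shape of that proof can be checked with toy numbers. The sketch below is the standard $\Sigma$-protocol for showing $c$ is an $n$-th power modulo $n^2$ (equivalently, a Paillier encryption of zero); the tiny modulus and fixed challenge are purely illustrative:

// Square-and-multiply modular exponentiation on BigInts.
function modPow(base, exp, mod) {
    let result = 1n;
    base %= mod;
    while (exp > 0n) {
        if (exp & 1n) result = (result * base) % mod;
        base = (base * base) % mod;
        exp >>= 1n;
    }
    return result;
}

const n = 15n, n2 = n * n;      // toy modulus n = 3 * 5
const r = 7n;                   // prover's secret randomness
const c = modPow(r, n, n2);     // c = r^n mod n^2: an encryption of zero

const rPrime = 4n;              // fresh randomness, gcd(4, 15) = 1
const a = modPow(rPrime, n, n2);            // prover's commitment
const e = 11n;                              // verifier's challenge
const z = (rPrime * modPow(r, e, n)) % n;   // prover's response

// Verifier accepts iff z^n = a * c^e (mod n^2); prints true.
console.log(modPow(z, n, n2) === (a * modPow(c, e, n2)) % n2);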
@@ -519,7 +525,7 @@ The library jsSHA \cite{jssha} provides an implementation of SHAKE256 that works
Various parts of the implementation use the random oracle model: in particular, the zero-knowledge proof sections.
-The random oracle model is used for two guarantees. The first is in the construction of truly random values that will not reveal information about the prover's state. In practice, a cryptographically secure pseudo random number generator will suffice for this application, as CSPRNGs typically incorporate environmental data to ensure outputs are unpredictable \cite{random(4)}.
+The random oracle model is used for two guarantees. The first is in the construction of truly random values that will not reveal information about the prover's state. In practice, a cryptographically secure pseudo-random number generator will suffice for this application, as CSPRNGs typically incorporate environmental data to ensure outputs are unpredictable \cite{random(4)}.
The second is to associate a non-random value with a random value. In practice, a cryptographic hash function such as SHA-3 is used. This gives appropriately pseudo-random outputs that appear truly random, and additionally are assumed to be preimage resistant: a necessary property when constructing non-interactive proofs in order to prevent a prover manipulating the signature used to derive the proof.
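For the first guarantee, a browser-side sketch: crypto.getRandomValues is the standard CSPRNG hook, while the helper name and the multiple-of-8 restriction are mine:

// Sample a uniformly random BigInt below 2^bits (bits must be a multiple
// of 8), using the CSPRNG rather than the predictable Math.random().
function randomBigInt(bits) {
    const bytes = new Uint8Array(bits / 8);
    crypto.getRandomValues(bytes);
    return bytes.reduce((acc, b) => (acc << 8n) | BigInt(b), 0n);
}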
@@ -583,15 +589,15 @@ There is little room for optimisation of the mathematics in Paillier encryption.
Taking this idea further, one may simply cache $r^n$ for a number of randomly generated $r$ (as this is the slowest part of encryption). This eliminates the timing attack concern, and grants full flexibility with the values being encrypted.
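A sketch of that caching strategy under the usual Paillier definitions, reusing modPow from the earlier sketch; n, n2 = n² and g are the public values, and randomUnit is a hypothetical sampler of r with gcd(r, n) = 1:

const pool = [];

// Perform the expensive r^n exponentiations ahead of time, e.g. while idle.
function refillPool(count) {
    for (let i = 0; i < count; i++) {
        pool.push(modPow(randomUnit(n), n, n2));
    }
}

// Encryption is then cheap: g^m is a short exponentiation for small m,
// and the r^n factor is simply popped from the cache.
function encrypt(m) {
    return (modPow(g, m, n2) * pool.pop()) % n2;
}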
-\textbf{Alternative Paillier scheme.} \cite{Jurik2003ExtensionsTT} presents an optimised encryption scheme based on the subgroup of elements with Jacobi symbol $+1$. This forms a group as the Jacobi symbol is multiplicative, being a generalisation of the Legendre symbol.
+\textbf{Alternative Paillier scheme.} \cite[Section~2.3.1]{Jurik2003ExtensionsTT} presents an optimised encryption scheme based on the subgroup of elements with Jacobi symbol $+1$. This forms a group as the Jacobi symbol is multiplicative, being a generalisation of the Legendre symbol.
Using this scheme alone reduced the time to encrypt by a half. Greater optimisations are possible through pre-computation of fixed-base exponentials, but this takes a considerable amount of time, and I found it infeasible within my implementation, since keypairs are only used for a single session.
-Furthermore, in practice gains were closer to a reduction by a third, since in the modified scheme additional computation must be performed to attain the $r$ that would work with normal Paillier, in order to perform the zero-knowledge proofs from before.
+In practice, gains were closer to a reduction by a third, since in the modified scheme additional computation must be performed to attain the $r$ that would work with normal Paillier, in order to perform the zero-knowledge proofs from before.
\textbf{Smaller key size.} The complexity of Paillier encryption increases with key size. Using a smaller key could considerably reduce the time taken \cite{paillier1999public}.
-I tested this on top of the alternative Paillier scheme from above. This resulted in linear reductions in encryption time: encryption under a 1024-bit modulus took half the amount of time as under a 2048-bit modulus and so on.
+I tested this on top of the alternative Paillier scheme from above. This resulted in considerable reductions in encryption time: encryption under a 1024-bit modulus took a sixth of the time taken under a 2048-bit modulus, and encryption under a 2048-bit modulus likewise took a sixth of the time taken under a 4096-bit modulus.
\textbf{Vectorised plaintexts.} The maximum size of a plaintext is $|n|$: in our case, this is 4096 bits. By considering this as a vector of 128 32-bit values, peers could use a single ciphertext to represent their entire state. \cite{10.1145/2809695.2809723} uses this process to allow embedded devices to make use of the homomorphic properties of Paillier.
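A sketch of the packing itself; the homomorphic sum of two packed ciphertexts then adds the vectors component-wise, provided no 32-bit lane overflows:

// Pack an array of BigInts (each < 2^32) into one plaintext, 32 bits per lane.
function pack(values) {
    return values.reduce((acc, v, i) => acc | (v << (32n * BigInt(i))), 0n);
}

// Recover `count` lanes from a decrypted plaintext.
function unpack(plaintext, count) {
    const mask = (1n << 32n) - 1n;
    return Array.from({ length: count }, (_, i) => (plaintext >> (32n * BigInt(i))) & mask);
}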
@@ -603,7 +609,7 @@ The other proofs do not translate so trivially to this structure however. In fac
\subsection{Complexity results}
-All measurements taken on Brave 1.50.114 (Chromium 112.0.5615.49) 64-bit, using a Ryzen 5 3600 CPU. Absolute timings are extremely dependent on the browser engine: for example Firefox 111.0.1 was typically 4 times slower than the results shown.
+All measurements were taken on Brave 1.50.114 (Chromium 112.0.5615.49) 64-bit, using a Ryzen 5 3600 CPU: a consumer CPU from 2019. Absolute timings are extremely dependent on the browser engine: for example, Firefox 111.0.1 was typically 4 times slower than the results shown.
\begin{table}[htp]
\fontsize{10pt}{10pt}\selectfont
@@ -622,29 +628,31 @@ All measurements taken on Brave 1.50.114 (Chromium 112.0.5615.49) 64-bit, using
\begin{table}[htp]
\fontsize{10pt}{10pt}\selectfont
\caption{Time to process proofs}
-\begin{tabularx}{\textwidth}{c *4{>{\Centering}X}}
+\begin{tabularx}{\textwidth}{c *6{>{\Centering}X}}
\toprule
-\multirow{2}{*}{Modulus size} & \multicolumn{2}{c}{Proof-of-zero non-interactive} & \multicolumn{2}{c}{\hyperref[protocol1]{Protocol~\ref*{protocol1}} with $t = 24$} \tabularnewline \cmidrule(l){2-3}\cmidrule(l){4-5}
-& Prover & Verifier & Prover & Verifier \\
+\multirow{2}{*}{Modulus size} & \multicolumn{2}{c}{Proof-of-zero non-interactive} & \multicolumn{2}{c}{\hyperref[protocol1]{Protocol~\ref*{protocol1}} with $t = 24$} & \multicolumn{2}{c}{\hyperref[protocol1]{Protocol~\ref*{protocol1}} with $t = 48$} \tabularnewline
+\cmidrule(l){2-3}\cmidrule(l){4-5}\cmidrule(l){6-7}
+& Prover & Verifier & Prover & Verifier & Prover & Verifier \\
\midrule
-$|n| = 1024$ & 10ms & 18ms & 1,740ms & 2,190ms \\
-$|n| = 2048$ & 44ms & 68ms & 8,170ms & 8,421ms \\
-$|n| = 4096$ & 225ms & 292ms & 41,500ms & 34,405ms \\
+$|n| = 1024$ & 10ms & 18ms & 1,740ms & 2,190ms & 3,380ms & 4,270ms \\
+$|n| = 2048$ & 44ms & 68ms & 8,170ms & 8,420ms & 16,100ms & 16,200ms \\
+$|n| = 4096$ & 225ms & 292ms & 41,500ms & 34,400ms & 83,200ms & 68,400ms \\
\bottomrule
\end{tabularx}
\end{table}
\begin{table}[htp]
\fontsize{10pt}{10pt}\selectfont
-\caption{Byte size\parnote{1 UTF-16 character, as used by ECMAScript, is 2 or more bytes} of encoded proofs}
+\caption{Byte size\parnote{1 UTF-16 character, as used by ECMAScript \cite[Section~6.1.4]{ecma2024262}, is 2 or more bytes.} of encoded proofs}
-\begin{tabularx}{\textwidth}{c *4{>{\Centering}X}}
+\begin{tabularx}{\textwidth}{c *6{>{\Centering}X}}
\toprule
-\multirow{2}{*}{Modulus size} & \multicolumn{2}{c}{Proof-of-zero non-interactive} & \multicolumn{2}{c}{\hyperref[protocol1]{Protocol~\ref*{protocol1}} with $t = 24$} \tabularnewline \cmidrule(l){2-3}\cmidrule(l){4-5}
-& JSON & with LZ-String & JSON & with LZ-String \\
+\multirow{2}{*}{Modulus size} & \multicolumn{2}{c}{Proof-of-zero non-interactive} & \multicolumn{2}{c}{\hyperref[protocol1]{Protocol~\ref*{protocol1}} with $t = 24$} & \multicolumn{2}{c}{\hyperref[protocol1]{Protocol~\ref*{protocol1}} with $t = 48$} \tabularnewline
+\cmidrule(l){2-3}\cmidrule(l){4-5}\cmidrule(l){6-7}
+& JSON & with LZ-String & JSON & with LZ-String & JSON & with LZ-String \\
\midrule
-$|n| = 1024$ & 1,617B & 576B (35.62\%) & 338,902B & 95,738B (28.25\%) \\
-$|n| = 2048$ & 3,153B & 1,050B (33.30\%) & 662,233B & 187,333B (28.29\%) \\
-$|n| = 4096$ & 6,226B & 1,999B (32.11\%) & 1,315,027B & 368,646B (28.03\%) \\
+$|n| = 1024$ & 1,617B & 576B & 338,902B & 95,738B & 673,031B & 186,857B \\
+$|n| = 2048$ & 3,153B & 1,050B & 662,233B & 187,333B & 1,315,463B & 365,086B \\
+$|n| = 4096$ & 6,226B & 1,999B & 1,315,027B & 368,646B & 2,609,131B & 721,891B \\
\bottomrule
\end{tabularx}
\parnotes
@@ -652,7 +660,7 @@ All measurements taken on Brave 1.50.114 (Chromium 112.0.5615.49) 64-bit, using
\subsection{Quantum resistance}
-The security of Paillier relies upon the difficulty of factoring large numbers \cite{paillier1999public}. Therefore, it is vulnerable to the same quantum threat as RSA is, which is described by \cite{shor_1997}. Alternative homomorphic encryption schemes are available, which are widely believed to be quantum-resistant, as they are based on lattice methods (e.g, \cite{fhe}).
+Paillier is broken if factoring large numbers is computationally feasible \cite[Theorem~9]{paillier1999public}. Therefore, it is vulnerable to the same quantum threat as RSA, which is described by \cite{shor_1997}. Alternative homomorphic encryption schemes are available, which are believed to be quantum-resistant, as they are based on lattice methods (e.g., \cite{fhe}).
\subsection{Side-channels}
@@ -689,17 +697,19 @@ Finally, I present a summary of other limitations that I encountered.
\subsection{JavaScript}
-To summarise, JavaScript was the incorrect choice of language for this project. Whilst the event-based methodology was useful, I believe overall that JavaScript hampered development.
+JavaScript was the incorrect choice of language for this project. Whilst the event-based methodology was useful, I believe overall that JavaScript made development much more difficult.
-JavaScript is a slow language. Prime generation takes a considerable amount of time, and this extends to encryption and decryption being slower than in an implementation in an optimising compiled language.
+JavaScript is a slow language. Prime generation takes a considerable amount of time, and encryption and decryption are likewise slower than they would be in an optimising compiled language, as the results above show.
-JavaScript's type system makes debugging difficult. It is somewhat obvious that this problem is far worse in systems with more interacting parts, which this project certainly was. TypeScript may have been a suitable alternative, but most likely the easiest solution was to avoid both and go with a language that was designed with stronger typing in mind from the outset (even Python would likely have been easier, as there is at least no issue of \texttt{undefined}ness, and the language was designed with objects in mind from the start).
+JavaScript's type system makes debugging difficult, and this problem is far worse in systems with many interacting parts. TypeScript may have been a suitable alternative, but the easiest solution was most likely to avoid both and choose a language designed with stronger typing from the outset (even Python would likely have been easier, as there is at least no issue of \texttt{undefined}ness, and the language was designed with objects in mind from the start).
+JavaScript is a re-entrant language: the developer is not exposed to threads or parallelism, but the event loop may still interleave handlers at asynchronous boundaries. This introduces the possibility of race conditions despite no explicit threading being used. The re-entrant nature is beneficial to a degree, however, as it means that long-running code will not cause the WebSocket to close or block other communications from being processed.
\subsection{General programming}
Peer-to-peer programming requires far more care than client-server programming, which makes development slower and more bug-prone. As a simple example, consider the action of taking a turn in Risk. In the peer-to-peer implementation presented, each separate peer must keep track of how far into a turn a player is, check whether a certain action would end their turn (or whether it is invalid), contribute to verifying proofs, and contribute to generating randomness for dice rolls. In a client-server implementation, the server would be able to handle a turn by itself, and could then propagate the results to the other clients in a single predictable request.
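To illustrate the bookkeeping burden, a hypothetical shape for the per-peer turn state; the phase names and action format are invented, not the implementation's:

const PHASES = ["reinforce", "attack", "fortify", "done"];

// Each peer runs one tracker per player, mirroring the acting player's
// progress so that every received action can be validated locally.
class TurnTracker {
    constructor() { this.phase = 0; }

    apply(action) {
        if (!this.isValid(action)) throw new Error("invalid action");
        if (action.kind === "endPhase") this.phase += 1;
        return PHASES[this.phase] === "done"; // true once the turn is over
    }

    isValid(action) {
        // e.g. attacks are only legal during the attack phase
        return action.kind !== "attack" || PHASES[this.phase] === "attack";
    }
}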
-The use of big integers leads to peculiar issues relating to signedness. This is in some ways a JavaScript issue, but would also be true in other languages. Taking modulo $n$ of a negative number tends to return a negative number, rather than a number squashed into the range $[0, n]$. This leads to inconsistencies when calculating the GCD or finding Bezout coefficients. In particular, this became an issue when trying to validate proofs of zero, as the GCD returned $-1$ rather than $1$ in some cases. Resolving this simply required changing the update and encrypt functions to add the modulus until the representation of the ciphertext was signed correctly. Whilst the fix for this was simple, having to fix this in the first place is annoying, and using a non-numerical type (such as a byte stream) may resolve this in general.
+The use of big integers leads to peculiar issues relating to signedness. This is in some ways a JavaScript issue, but would also be true in other languages. Taking modulo $n$ of a negative number tends to return a negative number, rather than a number within the range $[0, n)$. This leads to inconsistencies when calculating the GCD or finding B\'ezout coefficients. In particular, this became an issue when trying to validate proofs of zero, as the GCD returned $-1$ rather than $1$ in some cases. Resolving this simply required changing the update and encrypt functions to add the modulus until the representation of the ciphertext was signed correctly. Whilst the fix was simple, needing it at all was a nuisance, and using a non-numerical type (such as a byte stream) may avoid the problem entirely.
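A sketch of the conventional fix, for reference; this is not necessarily the exact change made to the update and encrypt functions:

// A remainder that always lands in [0, n), unlike JavaScript's %,
// which keeps the sign of the dividend.
function mod(a, n) {
    return ((a % n) + n) % n;
}

// mod(-7n, 5n) === 3n, whereas -7n % 5n === -2n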
\subsection{Resources}