\chapter{Performance Analysis of EC-ElGamal Solution}
Our first goal is to select the most efficient elliptic curve, for our ECCEG scheme, allowing a MU to perform susceptibility tests on patients in a short time without violating their genome privacy. Then, we want to compare the efficiency of the Paillier with the ElGamal solution in term of genome storage and susceptibility test runtime. Finally, we analyze the monetary stakes of both solutions.
\section{Choice Space}
We decide to choose an optimal elliptic curve for our application based on the field size of the elliptic curve and on the type of curve used. First we compare the elliptic curve cryptography security with the Paillier solution. Then we discuss the different type of curves available to select the most efficient and economic one.\par
The current solution uses Paillier Encryption Scheme with a security parameter of 4096 bits. We want to provide with the new solution a security at least as good as with Paillier. Paillier is in the class of Integer Factorization Cryptography (IFC) as RSA. We compare the strength of both cryptosystems using \reftab{tab:comp-key-strength}. We observe that for a key of 4096 bits for IFC, we need a field size above 384 bits for elliptic curve solutions.
80 & 2TDEA & L = 1024, N = 160 & k = 1024 & f = 160 - 223 \\
112 & 3TDEA & L = 2048, N = 224 & k = 2048 & f = 224 - 255 \\
128 & AES-128 & L = 3072, N = 256 & k = 3072 & f = 256 - 383 \\
192 & AES-192 & L = 7680, N = 384 & k = 7680 & f = 384 - 511 \\
256 & AES-256 & L = 15360, N = 512 & k = 15360 & f = 512+ \\
\end{tabular}
\caption{Comparative Key Strength \cite[Table 2, page 64]{NIST-800-57} between Symmetric key algorithms, Finite Field Cryptography (FFC), Integer Factorization Cryptography (IFC) and Elliptic-Curve Cryptography (ECC).}
\label{tab:comp-key-strength}
\end{table}
There exist different type of curves. They can be defined over primary or binary fields. Furthermore, there exist a special case of elliptic curves over binary fields known as Koblitz curves which have faster point multiplications. As suggested by \cite[section 3.4]{hankerson2004guide}\citep{Ugus_performanceof,MA_Ugus}, various elliptic curves can have different performance in term of additions, point multiplications and storage.
We perform this analysis on all the elliptic curves suggested by NIST\citep{fips-dss} provided by the BouncyCastle library. The full list of NIST available curves on BouncyCastle can be found in \refappendix{sec:nist-rec-curves}. We compare empirically the performance of each of the different elliptic curve.\par
We measure the size of an encoded compressed pair of elliptic curve points to obtain the memory requirements of a ciphertext for a given elliptic curve. We can see on \reffig{graph:perf-storage} that the storage requirement increases linearly with the size of field. We can also see that the storage is independent on the type of elliptic curve.
\begin{figure}
\begin{center}
\input{plots/perf_storage_comp.tex}
\end{center}
\caption{Storage requirement to store the set of genetic variations for different elliptic curves.}
\label{graph:perf-storage}
\end{figure}
Then we measure the runtime of multiple susceptibility tests to obtain the mean duration of a test. Standard susceptibility tests are performed over 1 to 50 SNPs. We compute artificial susceptibility test over 20 SNPs. We generate random SNP values and random weights, and perform a weighted average on the generated material. As the runtime of one test is rather short, we perform 100 different tests and report the mean runtime. On \reffig{graph:perf-runtime}, we can see that elliptic curves over primary field or Koblitz curves have much smaller runtime than normal curves over binary fields. Also for a larger field size the Koblitz curve has the same runtime as the primary one.
\begin{figure}
\begin{center}
\input{plots/perf_runtime_comp.tex}
\end{center}
\caption{Runtime to compute a weighted average for 20 SNPs for different elliptic curves.}
\label{graph:perf-runtime}
\end{figure}
\section{Selected Curve}\label{sec:select-curve}
In the analysis above, we have seen that we need to choose an elliptic curve with field of length at least 384 bits. We chose the NIST curve P-384 as it minimizes the storage and does not have any drawback in term of runtime. Also it provides security as good as the current Paillier solution.
\section{Comparison of Paillier with ElGamal on EC}
We compare in this section Paillier cryptosystem using a key of length 4096 bit and ElGamal cryptosystem on Elliptic Curve using the P-384 curve as motivated in \refsec{sec:select-curve}. We compare both systems in term of storage and runtime performance and in term of monetary cost. We base our analysis on the set of genetic variations composed of 5 millions SNPs.\par
We can see on \reftab{tab:perf-comp-sum} that ElGamal solution is 15 times faster than the Paillier solution for running a weighted average over 20 values. In term of storage requirement, ElGamal on EC is more than 10 times cheaper than than the Paillier solution. The encryption is 18 times faster with ElGamal. The partial decryption using a first secret key share is 3 times faster with ElGamal and the decryption completion has the same order of magnitude. We can clearly see that ElGamal on EC cryptosystem can provide much more efficient susceptibility tests while providing security equivalent to current Paillier cryptosystem configuration.
\caption{Performance comparison of Pailler (key size = 4096bits) against ElGamal on EC (field size = 384bits). Storage for full set of genetic variations. Runtime for a weighted average over 20 SNPs. Tests are performed on a computer running Windows 7 with an Intel Core i5 Processors of 2.4GHz.}
\label{tab:perf-comp-sum}
\end{table}
The monetary stakes are also analyzed. We select two different providers: Amazon S3 as global storage provider and Swisscom Dynamic Computing Services (DCS) as local storage provider. Amazon S3 does not provide any guarantee on the data center locations. Swisscom DCS guarantees that all the data will be stored in data centers in Switzerland \citep{swisscom-dcs-sales}. We want to emphasize the difference between traditional storage of non-sensitive data which could be cost efficiently done with Amazon S3 services and the need of storing sensitive medical data inland, due to legal or regulatory requirements in Switzerland, with a national provider. Pricing of both alternatives are shown in \reftab{tab:storage-pricing}.\par
\begin{table}
\centering
\begin{tabular}{l|l|l|c}
Provider location & Storage provider & Service name & Cost per GB per month \\
\caption{Pricing for storage with duplication for Amazon Web Services and Swisscom Dynamic Computing Services (DCS).}
\label{tab:storage-pricing}
\end{table}
On \reffig{fig:comp-mon} we can see that storing all set of variations at the level of Switzerland with a Swiss storage provider costs CHF 4.58 Mio for Paillier solution when only CHF 0.44 Mio for ElGamal on EC solution. In other words, the storage of the full set of genetic variations for a patient with ElGamal on EC would cost about CHF 0.05 per year. This cost is easily be absorbed by any health system or by any state. That large difference in term of cost at a national level could significantly change the adoption rate of pharmacogenetic solutions.
\begin{figure}
\begin{center}
\input{plots/perf_monetary_comp.tex}
\end{center}
\caption{Monetary comparison of Paillier with ElGamal on EC.}