In the following the model is applied to the German ID-cards. The training set of German ID-cards is not used; instead, the checkpoint from the training on the Swiss ID-cards (with 3 labels each) is reused.
The labels are given only in German, French and English. A comparison of the German labels and the Swiss labels can be seen in table \ref{FI_IT_DE_labels}. The following labels are very similar to the Swiss ones. They do not cover as many languages, but the French part uses almost the same words, so the labels are easy to associate (a sketch of such an association follows the list).
\begin{multicols}{3}
\begin{itemize}[itemsep=-1ex]
\item first name
\item last name
\item date of birth
\item date of expiry
\item height
\item card number
\item nationality
\end{itemize}
\end{multicols}
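To make this association concrete, the following is a minimal sketch of a label lookup, assuming the mapping is implemented as a plain dictionary. The German field names and the \texttt{to\_query\_label} helper are illustrative assumptions, not strings taken from the dataset.
\begin{verbatim}
# Illustrative mapping from labels as printed on the German card to
# the label set used during the Swiss training. The German strings
# are assumptions made for the sake of the example.
LABEL_MAP = {
    "Vorname":              "first name",
    "Name":                 "last name",
    "Geburtstag":           "date of birth",
    "Gueltig bis":          "date of expiry",
    "Groesse":              "height",
    "Ausweisnummer":        "card number",
    "Staatsangehoerigkeit": "nationality",
}

def to_query_label(raw_label):
    # Fall back to the raw label for fields unknown to the Swiss
    # checkpoint, such as "Augenfarbe" (eye color).
    return LABEL_MAP.get(raw_label.strip(), raw_label)
\end{verbatim}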
There is no label for the person's gender on the German ID-card. However, there is a label for the eye color, which is a completely new concept. The valid values for the eye color are "bernstein", "grün", "grünbraun", "grau", "blau", "hellbraun", "hellblau" and "blaugrün" (amber, green, green-brown, grey, blue, light brown, light blue and blue-green), i.e. essentially different shades of brown, blue and green. When the checkpoint from the Swiss training is applied directly, there are very few correct information extractions (see figure \ref{DE_tp_eye_color}). After unsupervised fine-tuning on 100 ID-cards, the share of true positives increases to about 20\% (figure \ref{DE1_tp_eye_color}), and fine-tuning on another 100 ID-cards raises it further to roughly 50\% (figure \ref{DE2_tp_eye_color}). Starting from only 3\% true positives, this is a surprisingly good result, especially since many false-positive examples were used in the first fine-tuning iteration.
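The unsupervised fine-tuning loop can be summarised by the following minimal sketch. It assumes hypothetical \texttt{model.extract} and \texttt{model.finetune} helpers and an assumed confidence threshold; the actual training interface of the model is not shown here.
\begin{verbatim}
# One unsupervised fine-tuning iteration: the current checkpoint
# answers the queries itself, and only high-confidence answers are
# kept as pseudo-labels. model.extract, model.finetune and the
# threshold value are hypothetical names, not the actual interface.
CONF_THRESHOLD = 0.9  # assumed cut-off for keeping an answer

def self_training_round(model, cards, queries):
    pseudo_labeled = []
    for card in cards:          # e.g. 100 cards per iteration
        for query in queries:   # e.g. 10 queries per card
            answer, confidence = model.extract(card, query)
            if confidence >= CONF_THRESHOLD:
                # May still be a false positive; no human check.
                pseudo_labeled.append((card, query, answer))
    model.finetune(pseudo_labeled)
    return model
\end{verbatim}
Because the pseudo-labels are the model's own high-confidence answers, some of them are wrong; the results above suggest the procedure is nevertheless fairly robust to this noise.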
\begin{figure}
\centering
\begin{subfigure}{\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_f1_0}
\caption{Using checkpoint trained on CH-ID's}
\label{DE_f1}
\end{subfigure}%
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_f1_1}
\caption{After unsupervised fine tuning using 100 ID-card-examples (1000 queries)}
\label{DE2_f1}
\end{subfigure}%
\hspace{10pt}
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_f1_2}
\caption{After unsupervised fine tuning using 200 ID-card-examples (2000 queries)}
\label{DE3_f1}
\end{subfigure}%
\vspace{10pt}
% \begin{subfigure}{0.5\textwidth}
% \centering
% \includegraphics[width=.99\linewidth]{plots/FR/f_f1_3}
% \caption{After unsupervised fine tuning using 300 ID-card-examples (3000 queries)}
% \label{FR3}
% \end{subfigure}%
\caption{F1 score on German ID-cards. In sub-figure \ref{DE_f1} the checkpoint trained on the CH-dataset is applied directly; due to differences in formats and labels, many information extractions are not an exact match. Performance increases very quickly when fine-tuning on the new dataset: the main performance gain is already reached with just 100 new examples. Fine-tuning is unsupervised, meaning the answer is extracted by the model itself and is thus sometimes wrong, but no human feedback (expert knowledge) is needed.}
\label{DEs_f1}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_k_1_0}
\caption{Using checkpoint trained on CH-ID's}
\label{DE_k}
\end{subfigure}%
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_k_1_1}
\caption{After unsupervised fine tuning using 100 ID-card-examples (1000 queries)}
\label{DE2_k}
\end{subfigure}%
\hspace{10pt}
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_k_1_2}
\caption{After unsupervised fine tuning using 200 ID-card-examples (2000 queries)}
\label{DE3_k}
\end{subfigure}%
\vspace{10pt}
% \begin{subfigure}{0.5\textwidth}
% \centering
% \includegraphics[width=.99\linewidth]{plots/FR/f_f1_3}
% \caption{After unsupervised fine tuning using 300 ID-card-examples (3000 queries)}
% \label{FR3}
% \end{subfigure}%
\caption{W score on German ID-cards. In sub-figure \ref{DE_k} the checkpoint trained on the CH-dataset is applied directly; due to differences in formats and labels, many information extractions are not an exact match. Performance increases very quickly when fine-tuning on the new dataset: the main performance gain is already reached with just 100 new examples. Fine-tuning is unsupervised, meaning the answer is extracted by the model itself and is thus sometimes wrong, but no human feedback (expert knowledge) is needed.}
\label{DEs_k}
\end{figure}
\begin{figure}
\centering
\begin{subfigure}{\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_tp_all_0}
\caption{Using checkpoint trained on CH-ID's}
\label{DE_tp}
\end{subfigure}%
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_tp_all_1}
\caption{After unsupervised fine tuning using 100 ID-card-examples (1000 queries)}
\label{DE1_tp}
\end{subfigure}%
\hspace{10pt}
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_tp_all_2}
\caption{After unsupervised fine tuning using 200 ID-card-examples (2000 queries)}
\label{DE2_tp}
\end{subfigure}%
\vspace{10pt}
% \begin{subfigure}{0.5\textwidth}
% \centering
% \includegraphics[width=.99\linewidth]{plots/FR/f_f1_3}
% \caption{After unsupervised fine tuning using 300 ID-card-examples (3000 queries)}
% \label{FR3_tp}
% \end{subfigure}%
\caption{Number of true positives, true negatives, false positives and false negatives in the German ID-cards test set (first 1000 queries). In sub-figure \ref{DE_tp} the checkpoint trained on the CH-dataset is applied directly; due to differences in formats and labels, many information extractions are not an exact match. Note that the number of false positives approaches zero when only examples with a high confidence score are considered. The number of true positives increases very quickly when fine-tuning on the new dataset. Fine-tuning is unsupervised, meaning the answer is extracted by the model itself and is thus sometimes wrong, but no human feedback (expert knowledge) is needed.}
\label{DEs_tp}
\end{figure}
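The confidence-based view in figure \ref{DEs_tp} can be reproduced by counting predictions above a confidence cut-off, as in the following minimal sketch. The \texttt{predictions} structure is an assumed list of (confidence, prediction, ground truth) tuples, not the actual evaluation code.
\begin{verbatim}
# Count true/false positives among predictions whose confidence is
# at least `threshold`. As the threshold rises, the false-positive
# count shrinks towards zero, as seen in the figure above.
def counts_above(predictions, threshold):
    tp = fp = 0
    for confidence, predicted, truth in predictions:
        if confidence < threshold:
            continue
        if predicted == truth:
            tp += 1
        else:
            fp += 1
    return tp, fp
\end{verbatim}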
\begin{figure}
\centering
\begin{subfigure}{\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_tp_eyeColor_0}
\caption{Using checkpoint trained on CH-ID's}
\label{DE_tp_eye_color}
\end{subfigure}%
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_tp_eyeColor_1}
\caption{After unsupervised fine tuning using 100 ID-card-examples (1000 queries)}
\label{DE1_tp_eye_color}
\end{subfigure}%
\hspace{10pt}
\begin{subfigure}{0.48\textwidth}
\centering
\includegraphics[width=.99\linewidth]{plots/DE/f_tp_eyeColor_2}
\caption{After unsupervised fine tuning using 200 ID-card-examples (2000 queries)}
\label{DE2_tp_eye_color}
\end{subfigure}%
\vspace{10pt}
% \begin{subfigure}{0.5\textwidth}
% \centering
% \includegraphics[width=.99\linewidth]{plots/FR/f_f1_3}
% \caption{After unsupervised fine tuning using 300 ID-card-examples (3000 queries)}
% \label{FR3_tp}
% \end{subfigure}%
\caption{Number of true positives, true negatives, false positives and false negatives on German ID-cards for the unknown label "eye color". In sub-figure \ref{DE_tp_eye_color} the checkpoint trained on the CH-dataset is applied directly. To fine-tune the model, the model's own answers with a high confidence score are used; these can contain wrong answers (false positives).}
\label{DEs_tp_eye_color}
\end{figure}
