Update website section #5

Merged
loos merged 2 commits from mveril/QUESTDB:website into master 2020-11-26 11:16:18 +01:00


@ -38,7 +38,7 @@
\newcommand{\fnt}{\footnotetext} \newcommand{\fnt}{\footnotetext}
\newcommand{\tabc}[1]{\multicolumn{1}{c}{#1}} \newcommand{\tabc}[1]{\multicolumn{1}{c}{#1}}
\newcommand{\QP}{\textsc{quantum package}} \newcommand{\QP}{\textsc{quantum package}}
\newcommand{\SupInf}{supporting information}%DJ: I would have used SI and defined it at the first occurrence \newcommand{\SupInf}{supporting information}%DJ: I would have used SI and defined it at the first occurrence
%Vector %Vector
\renewcommand{\vec}[1]{\bm{#1}} \renewcommand{\vec}[1]{\bm{#1}}
@ -1279,7 +1279,7 @@ Of course, one of the remaining open questions regarding all these methods is th
\label{sec:websiteIntro} \label{sec:websiteIntro}
%======================= %=======================
The previous QUEST publications \cite{Loos_2018a,Loos_2019,Loos_2020b,Loos_2020c,Loos_2020d} report vertical excitation data, together with statistics computed for the most relevant parameters. The previous QUEST publications \cite{Loos_2018a,Loos_2019,Loos_2020b,Loos_2020c,Loos_2020d} report vertical excitation data, together with statistics computed for the most relevant parameters.
But depending on the specific interest of quantum chemist, this parameter selection can be irrelevant for his study. But depending on the specific interest of a quantum chemist, this parameter selection can be irrelevant for his study.
Furthermore, to determine the accuracy of a new method, it must be compared with reference data, such as those of the QUEST project. Furthermore, to determine the accuracy of a new method, it must be compared with reference data, such as those of the QUEST project.
For this, the same type of statistics has to be calculated for the new method. The QUESTDB website was created precisely to solve these issues. For this, the same type of statistics has to be calculated for the new method. The QUESTDB website was created precisely to solve these issues.
%======================= %=======================
@ -1293,48 +1293,32 @@ The website specification are the following
\item Calculate statistics from these parameters \item Calculate statistics from these parameters
\item Display a box plot graph to easily show the accuracy of the methods \item Display a box plot graph to easily show the accuracy of the methods
\end{itemize} \end{itemize}
This solves the issues described in \ref{sec:websiteIntro} This solves the issues described previously in Section \ref{sec:websiteIntro}.
%======================= %=======================
\subsection{Usage} \subsection{Architecture}
%======================= %=======================
We built the website mainly to meet two use cases. The website architecture is designed to be simple and to facilitate the integration of new data.
\theoremstyle{break} It is composed of two parts.
\theorembodyfont{\normalfont} \begin{itemize}
\newtheorem{scenar}{Scenario}{} \item A static website used to display data and statistics.
\begin{scenar} \item A series of Python tools used to generate data readable by the website.
\label{scenar:choose} \end{itemize}
The user wants to choose a method for his calculation or a series of calculations.
Of course, he seeks a compromise between the accuracy and the cost of the method.
In this case, he wants to compare the accuracy of each method on a subset of excitation data corresponding to his target.
He can adjust the filters to correspond to his target (molecular size, molecule or excitation type).
If the target molecule is available in the QUEST data, he can select only this molecule.
\end{scenar}
\begin{scenar}
\label{scenar:new}
The user has created a new method and wants to compare its accuracy with the methods of the QUEST project.
First, he has to create an input file for the Python tools (see Sec.~\ref{sec:gentools}) by formatting the calculated results as a {\LaTeX} \texttt{tabular}.
After generating the data with the same Python tools that we use to import the QUEST data, he must import the new absorption and the fluorescence data files using the button on the website.
The new data are then used in the same way as the reference data to generate statistics, and he can use the website to compute the statistics in order to compare the methods.
\end{scenar}
%=======================
\subsection{Project}
%=======================
The project contains two parts
%------------------------------------------------ %------------------------------------------------
\subsubsection{Website} \subsubsection{The static website}
%------------------------------------------------ %------------------------------------------------
This is the main part of the project. All the calculations are made locally on the dataset page. The static website is the main part. All the statistical calculations are made locally on the dataset page.
First, the website offers the user the option to import new data (see Sec.~\ref{sec:gentools}). The server is only responsible for serving the pages and data of the QUEST project to the client.
these data are added to the current session (and removed after lost the page). If you want to work with the QUEST data, you must go to the dataset page.
First, the website offers the user the option to import new data (see Sec.~\ref{sec:gentools});
these data are added temporarily to the current session (and removed after leaving the page).
There are four multi-selection lists. Each list depends on the previous ones. There are four multi-selection lists. Each list depends on the previous ones.
These lists allow the user to select information about the selected sets \ref{fig:scheme}. These lists allow the user to select information about the selected sets \ref{fig:scheme},
Molecules \ref{fig:molecules} methods and basis (see Sec.~\ref{sec:methods}). molecules \ref{fig:molecules}, methods and basis (see Sec.~\ref{sec:methods}).
Next, several filters allow the user to choose the properties of the included excitations. Next, several filters allow the user to choose the properties of the included excitations.
We also provide the ability to filter by molecule size or by active character percentage. We also provide the ability to filter by molecule size or by active character percentage.
After that, we need to define a reference method to compare with (TBE by default). After that, we need to define a reference method among the already selected methods to compare with (TBE by default).
We also provide a flag to exclude all the values declared not safe. We declare a value as unsafe when the value has too big We also provide a flag to exclude all the values declared not safe. We declare a value as unsafe when the value has a too big
uncertainty. uncertainty.
\paragraph{Statistics calculations} \paragraph{Statistics calculations}
We want to calculate the accuracy of each method/basis pair compared to the reference (usually TBEs). We want to calculate the accuracy of each method/basis pair compared to the reference (usually TBEs).
@ -1365,26 +1349,45 @@ On the website the statistics are forwarded in a table and in a box plot graph.
\subsubsection{Data generation tools} \subsubsection{Data generation tools}
\label{sec:gentools} \label{sec:gentools}
%------------------------------------------------ %------------------------------------------------
There are multiple that we used to generate the data. There are multiple tools that we used to generate the data.
These tools can also be used by the user (see Scenario~\ref{scenar:new}). These tools can also be used by the user (see Scenario~\ref{scenar:new}).
There are currently two main tools to generate data: \texttt{datafileBuilder} and \texttt{ADC25generator}. The main tool is \texttt{datafileBuilder}, used to generate data files from a {\LaTeX} \texttt{tabular}.
\paragraph{datafileBuilder}
The \texttt{datafileBuilder} tool is used to build data files from a {\LaTeX} \texttt{tabular}.
The \texttt{tabular} is associated with some options and {\LaTeX} \texttt{\textbackslash newcommand} definitions, which are parsed by the main script, and the \texttt{tabular} environment is converted to a \texttt{NumPy} 2D array. The \texttt{tabular} is associated with some options and {\LaTeX} \texttt{\textbackslash newcommand} definitions, which are parsed by the main script, and the \texttt{tabular} environment is converted to a \texttt{NumPy} 2D array.
The options, the {\LaTeX} \texttt{\textbackslash newcommand} definitions to apply, and the 2D array that represents the tabular environment are then passed to the appropriate table parser module, chosen using the \texttt{\textbackslash formatName} option in the input file. The options, the {\LaTeX} \texttt{\textbackslash newcommand} definitions to apply, and the 2D array that represents the tabular environment are then passed to the appropriate table parser module, chosen using the \texttt{\textbackslash formatName} option in the input file.
Each module is responsible for parsing the \texttt{tabular} and returning all the corresponding data files as objects. Each module is responsible for parsing the \texttt{tabular} and returning all the corresponding data files as objects.
Afterwards, the main script writes these objects to the corresponding files. These files can be used in the website Afterwards, the main script writes these objects to the corresponding files. These files can be used in the website
by importing them temporarily, or to make a pull request for the new data. by importing them temporarily, or to make a pull request for the new data.
The modular aspect of this tool gives us enough flexibility to easily convert many types of {\LaTeX} \texttt{tabular} to a standardized file format. The modular aspect of this tool gives us enough flexibility to easily convert many types of {\LaTeX} \texttt{tabular} to a standardized file format.
\paragraph*{ADC25generator}
The \texttt{ADC25generator} tool merges ADC(2) and ADC(3) metadata and calculates the ADC(2.5) energy from the ADC(2) and ADC(3) data files as %=======================
\begin{equation} \subsection{Usage}
E_\text{ADC(2.5)} = \frac{E_\text{ADC(2)}+E_\text{ADC(3)}}{2} %=======================
\end{equation} \subsubsection{Manipulation}
and the value is considered not safe when one or more of the values is not safe First, the user can add his own absorption and fluorescence data if he wants to analyse a custom dataset.
\begin{equation} In the dataset tab, the user can select the data of interest by choosing the sets, molecules, methods and basis.
\mathrm{unsafe}_\text{ADC(2.5)} = \mathrm{unsafe}_\text{ADC(2)} \lor \mathrm{unsafe}_\text{ADC(3)} After that, the user can customize which excitations to take into account.
\end{equation}
\subsubsection{Scenarios}
We built the website mainly to meet two use cases.
\theoremstyle{break}
\theorembodyfont{\normalfont}
\newtheorem{scenar}{Scenario}{}
\begin{scenar}
\label{scenar:choose}
The user wants to choose a method for his calculation or a series of calculations.
Of course, he seeks a compromise between the accuracy and the cost of the method.
In this case, he wants to compare the accuracy of each method on a subset of excitation data corresponding to his target.
He can adjust the filters to correspond to his target (molecular size, molecule or excitation type).
If the target molecule is available in the QUEST data, he can select only this molecule.
\end{scenar}
\begin{scenar}
\label{scenar:new}
The user has created a new method and wants to compare its accuracy with the methods of the QUEST project.
First, he has to create an input file for the Python tools (see Sec.~\ref{sec:gentools}) by formatting the calculated results as a {\LaTeX} \texttt{tabular}.
After generating the data with the same Python tools that we use to import the QUEST data, he must import the new absorption and fluorescence data files using the button on the website,
so that the new data are used in the same way as the reference data to generate statistics.
Afterwards, he can use the website to compute the statistics in order to compare the methods.
\end{scenar}
} }
%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Concluding remarks} \section{Concluding remarks}