The many-body Green's function Bethe-Salpeter formalism is steadily asserting itself as a new efficient and accurate tool in the armada of computational methods available to chemists in order to predict neutral electronic excitations in molecular systems.
In particular, the combination of the so-called $GW$ approximation of many-body perturbation theory, which gives access to reliable charged excitations and quasiparticle energies, with the Bethe-Salpeter formalism, which captures excitonic effects, has been shown to provide accurate excitation energies in many chemical scenarios, with a typical error of $0.1$--$0.3$ eV.
In this \textit{Perspective} article, we provide a historical overview of the Bethe-Salpeter formalism, with a particular focus on its condensed-matter roots, and we propose a critical review of its strengths and weaknesses for different chemical situations, such as \titou{bla bla bla}.
Future directions of developments and improvements are also discussed.
In its press release announcing the award of the 2013 Nobel Prize in Chemistry to Karplus, Levitt and Warshel, the Royal Swedish Academy of Sciences concluded by stating \textit{``Today the computer is just as important \titou{as} a tool for chemists as the test tube. Simulations are so realistic that they predict the outcome of traditional experiments.''} Martin Karplus's Nobel lecture moderated this bold statement, introducing his presentation with a 1929 quote from Dirac emphasizing that the laws of quantum mechanics lead to equations \textit{``much too complicated to be soluble''}, urging the scientist to develop \textit{``approximate practical methods''}. This is where the methodology community stands, attempting to develop robust approximations to study with increasing accuracy the properties of ever more complex systems.
The study of neutral electronic excitations in condensed matter systems, from molecules to extended solids, has witnessed the development of a large number of such approximate methods with numerous applications to a large variety of fields, from the prediction of the colour of precious metals and stones for jewellery, to the understanding, \eg, of the basic principles behind photovoltaics, photocatalysis or DNA damage under irradiation in the context of biology. The present Perspective aims at describing the current status and upcoming challenges for the Bethe-Salpeter equation (BSE) formalism which, while sharing many features with time-dependent density functional theory (TD-DFT), including the scaling of its computational cost with system size, relies on a different theoretical framework, with specific difficulties of its own but also potential solutions to known problems.
The Bethe-Salpeter equation formalism \cite{Salpeter_1951,Strinati_1988,Albrecht_1998,Rohlfing_1998,Benedict_1998,vanderHorst_1999} belongs to the family of Green's function many-body perturbation theories (MBPT) \cite{Hedin_1965,Aryasetiawan_1998,Onida_2002,Reining_2017,ReiningBook} together with, \eg, the algebraic diagrammatic construction (ADC) techniques in quantum chemistry. \cite{Dreuw_2015} While the density and density matrix stand as the basic variables in DFT and Hartree-Fock, Green's function MBPT takes the one-body Green's function as the central quantity. The (time-ordered) one-body Green's function reads:
\begin{equation}
G(xt,x't') = -i \langle N | T \left[ {\hat\psi}(xt) {\hat\psi}^{\dagger}(x't') \right] | N \rangle ,
\end{equation}
where $| N \rangle$ is the $N$-electron ground-state wavefunction. The operators ${\hat\psi}(xt)$ and ${\hat\psi}^{\dagger}(x't')$ remove/add an electron at the space-spin-time positions $(xt)$ and $(x't')$, while $T$ is the time-ordering operator. For $t>t'$, the one-body Green's function provides the probability amplitude of finding, on top of the ground-state Fermi sea, an electron in $(xt)$ that was previously introduced in $(x't')$, while for $t<t'$ it is the propagation of a hole that is monitored. \\
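\noindent{\textbf{Charged excitations.}} A central property of the one-body Green's function is that its spectral representation presents poles at the charged excitation energies of the system:
\begin{equation}\label{spectralG}
G(x,x';\omega) = \sum_s \frac{f_s(x) f_s^{*}(x')}{\omega - \varepsilon_s + i \eta \, \mathrm{sgn}(\varepsilon_s - \mu)} ,
\end{equation}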
where $\varepsilon_s = E_s(N+1)- E_0(N)$ for $\varepsilon_s > \mu$ ($\mu$ being the chemical potential and $\eta$ a positive infinitesimal) and $\varepsilon_s = E_0(N)- E_s(N-1)$ for $\varepsilon_s < \mu$. The quantities $E_s(N+1)$ and $E_s(N-1)$ are the total energies of the $s$-th excited state of the $(N+1)$- and $(N-1)$-electron systems, while $E_0(N)$ is the $N$-electron ground-state energy. Contrary to the Kohn-Sham eigenvalues, the Green's function poles $\lbrace\varepsilon_s \rbrace$ are thus the proper charging energies of the $N$-electron system, leading to well-defined ionization potentials and electron affinities. Contrary to standard $\Delta$SCF techniques, the knowledge of $G$ provides the full ionization spectrum, as measured by direct and inverse photoemission, not only that associated with frontier orbitals. The $\lbrace f_s \rbrace$ are the so-called Lehmann amplitudes, which reduce to one-body orbitals in the case of single-determinant many-body wave functions [more ??].
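The one-body Green's function obeys the equation of motion
\begin{equation}\label{Gmotion}
\left[ i \frac{\partial}{\partial t_1} - h({\bf r}_1) \right] G(1,2) - \int d3 \; \Sigma(1,3) G(3,2) = \delta(1,2),
\end{equation}
where we introduce the usual composite index, \eg, $1\equiv(x_1,t_1)$. Here, $h$ is the one-body Hartree Hamiltonian and $\Sigma$ the so-called exchange-correlation self-energy operator. Using the spectral representation of $G$, one obtains a familiar eigenvalue equation:
\begin{equation}
h({\bf r}) f_s({\bf r}) + \int d{\bf r}' \; \Sigma({\bf r},{\bf r}'; \varepsilon_s) f_s({\bf r}') = \varepsilon_s f_s({\bf r}),
\end{equation}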
which formally resembles the Kohn-Sham equation, with the difference that the self-energy $\Sigma$ is non-local, energy-dependent and non-Hermitian. The knowledge of the self-energy operator thus gives access to the true addition/removal energies, namely the entire spectrum of occupied and virtual electronic energy levels, at the cost of solving a generalized one-body eigenvalue equation. [INTRODUCE QUASIPARTICLES and OTHER solutions ??] \\
\noindent{\textbf{The $GW$ self-energy.}} While the presented equations are formally exact, it remains to provide an expression for the exchange-correlation self-energy operator $\Sigma$. This is where practical Green's function theories differ. Developed by Lars Hedin in 1965 with application to the interacting homogeneous electron gas, \cite{Hedin_1965} the $GW$ approximation
\cite{Aryasetiawan_1998,Farid_1999,Onida_2002,Ping_2013,Leng_2016,Golze_2019rev} follows the path of linear response by considering the variation of $G$ with respect to an external perturbation. The resulting equation, when compared with the equation for the time evolution of $G$ (Eqn.~\ref{Gmotion}), leads to a formal expression for the self-energy:
\begin{equation}
\Sigma(1,2) = i \int d34 \; G(1,4) W(3,1^{+}) \Gamma(42,3),
\end{equation}
where $W$ is the dynamically screened Coulomb potential and $\Gamma$ a ``vertex'' function that can be written as $\Gamma(12,3)=\delta(12)\delta(13)+\mathcal{O}(W)$, where $\mathcal{O}(W)$ denotes a corrective term whose leading order is linear in $W$. Neglecting this vertex correction leads to the so-called $GW$ approximation for $\Sigma$, which can be regarded as the lowest-order perturbation in terms of the screened Coulomb potential $W$, with:
\begin{gather}
\Sigma^{\GW}(1,2) = i \, G(1,2) W(2,1^{+}), \\
W(1,2) = v(1,2) + \int d34 \; v(1,3) \chi_0(3,4) W(4,2),
\end{gather}
with $\chi_0$ the well-known independent-electron susceptibility and $v$ the bare Coulomb potential. In practice, the input $G$ and $\chi_0$ needed to build $\Sigma$ are taken to be the best Green's function and susceptibility that can be easily calculated, namely the DFT (or HF) ones, where the $\lbrace\varepsilon_s, f_s \rbrace$ of Eqn.~\ref{spectralG} are taken to be DFT Kohn-Sham (or HF) eigenstates. Taking then $(\Sigma^{\GW}-V^{\XC})$ as a correction to the DFT exchange-correlation potential $V^{\XC}$, a first-order correction to the input Kohn-Sham energies $\lbrace\varepsilon_n^{KS}\rbrace$ is obtained as follows:
\begin{equation}
\varepsilon_n^{\GW} = \varepsilon_n^{KS} + \langle \phi_n^{KS} | \Sigma^{\GW}(\varepsilon_n^{\GW}) - V^{\XC} | \phi_n^{KS} \rangle .
\end{equation}
Such an approach, where input Kohn-Sham energies are corrected to yield better electronic energy levels, is labeled the single-shot, or perturbative, $G_0W_0$ technique. This simple scheme was used in the early $GW$ studies of extended semiconductors and insulators, \cite{Strinati_1980,Hybertsen_1986,Godby_1988,Linden_1988}
surfaces, \cite{Northrup_1991,Blase_1994,Rohlfing_1995} and 2D systems, \cite{Blase_1995} dramatically reducing the errors associated with Kohn-Sham eigenvalues in conjunction with common local or gradient-corrected approximations to the exchange-correlation potential.
In particular, the well-known ``band gap'' problem, \cite{Perdew_1983,Sham_1983} namely the underestimation of the energy gap between occupied and unoccupied bands at the LDA Kohn-Sham level, was largely cured, bringing the agreement with experiment to within a few tenths of an eV [REFS] with an $\mathcal{O}(N^4)$ computational cost scaling (see below). As another important feature compared to other perturbative techniques, the $GW$ formalism can tackle finite-size and periodic systems alike, and does not present any divergence in the limit of zero-gap (metallic) systems. \cite{Campillo_1999} However, as a low-order perturbative approach starting from mono-determinantal mean-field solutions, it is not intended to explore strongly correlated systems. \cite{Verdozzi_1995}\\
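As a hedged illustration of how such a one-shot correction is commonly evaluated in practice, the self-energy can be expanded to first order in frequency around the input Kohn-Sham energy. The renormalization factor $Z_n$ and the shorthands $\Sigma_n(\omega) = \langle \phi_n | \Sigma^{\GW}(\omega) | \phi_n \rangle$ and $V_n^{\XC} = \langle \phi_n | V^{\XC} | \phi_n \rangle$ are notations introduced here for illustration only. Under the assumption that $\Sigma_n(\omega)$ varies smoothly near $\varepsilon_n^{KS}$, one obtains
\begin{equation}
\varepsilon_n^{\GW} \simeq \varepsilon_n^{KS} + Z_n \left[ \Sigma_n(\varepsilon_n^{KS}) - V_n^{\XC} \right],
\qquad
Z_n = \left[ 1 - \left. \frac{\partial \Sigma_n(\omega)}{\partial \omega} \right|_{\omega = \varepsilon_n^{KS}} \right]^{-1}.
\end{equation}
This linearization avoids solving the frequency-dependent quasiparticle equation iteratively for each level, at the price of the smoothness assumption stated above.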
\noindent{\textbf{Neutral excitations.}} While TD-DFT starts with the variation of the charge density $\rho$ with respect to an external local perturbation, the BSE formalism considers a generalized four-point susceptibility that monitors the variation of the Green's function with respect to a non-local external perturbation:
that relates the full (interacting) $G$ to the Hartree $G_0$ obtained by replacing the $\lbrace\varepsilon_s, f_s \rbrace$ by the Hartree eigenvalues and eigenfunctions.
Differentiating the Dyson equation with respect to $U$ yields:
where it is traditional to neglect the derivative $(\partial W /\partial G)$, which again introduces higher orders in $W$. Taking the static limit $W(\omega=0)$ for the screened Coulomb potential, which thus replaces the static DFT exchange-correlation kernel, and expressing Eqn.~\ref{DysonL} in the standard product space $\lbrace\phi_i({\bf r})\phi_a({\bf r}')\rbrace$, where $(i,j)$ and $(a,b)$ index occupied and virtual orbitals, leads to an eigenvalue problem similar to the so-called Casida equations in TD-DFT:
where $\lambda$ indexes the electronic excitations. The $\lbrace\phi_{i/a}\rbrace$ are the input (Kohn-Sham) eigenstates used to build the $GW$ self-energy. The resonant part of the BSE Hamiltonian reads:
\begin{itemize}
\item the non-local screened Coulomb matrix elements replace the DFT exchange-correlation kernel.
\end{itemize}
We emphasize that these equations can be solved at exactly the same cost as the standard TD-DFT equations once the quasiparticle energies and screened Coulomb potential $W$ are inherited from preceding $GW$ calculations. This defines the standard (static) BSE/$GW$ scheme that we discuss in this Perspective, emphasizing its pros and cons. \\
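For orientation, the structure of such an eigenvalue problem can be sketched as follows for closed-shell singlet excitations with real orbitals. The symbols $R$, $C$, $X^{\lambda}$, $Y^{\lambda}$, $\Omega_{\lambda}$ and the index ordering of $W_{ij,ab}$ are assumptions made here for illustration and may differ from the conventions of a given implementation:
\begin{gather}
\begin{pmatrix} R & C \\ -C^{*} & -R^{*} \end{pmatrix}
\begin{pmatrix} X^{\lambda} \\ Y^{\lambda} \end{pmatrix}
= \Omega_{\lambda}
\begin{pmatrix} X^{\lambda} \\ Y^{\lambda} \end{pmatrix},
\\
R_{ai,bj} = \left( \varepsilon_a^{\GW} - \varepsilon_i^{\GW} \right) \delta_{ab} \delta_{ij} + 2 (ai|bj) - W_{ij,ab},
\\
W_{ij,ab} = \int d{\bf r} \, d{\bf r}' \; \phi_i({\bf r}) \phi_j({\bf r}) W({\bf r},{\bf r}'; \omega=0) \phi_a({\bf r}') \phi_b({\bf r}'),
\end{gather}
where $(ai|bj)$ denotes a bare Coulomb integral in Mulliken notation. In the corresponding TD-DFT expression with a (semi)local functional, the quasiparticle energy differences are replaced by Kohn-Sham ones and $-W_{ij,ab}$ by the matrix elements of the exchange-correlation kernel.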
Originally developed in the framework of nuclear physics, \cite{Salpeter_1951}
the use of the BSE formalism in condensed-matter physics emerged in the 1960s at the semi-empirical tight-binding level with the study of the optical properties of simple semiconductors. \cite{Sham_1966,Strinati_1984,Delerue_2000}
Three decades later, the first \textit{ab initio} implementations, starting with small clusters \cite{Onida_1995,Rohlfing_1998} and extended semiconductors and wide-gap insulators, \cite{Albrecht_1997,Benedict_1998,Rohlfing_1999}
paved the way to the popularization of the BSE formalism in the solid-state physics community.
Following early applications to periodic polymers and molecules, [REFS] the BSE formalism gained much momentum in the quantum chemistry community, in particular through several benchmarks \cite{Korbel_2014,Jacquemin_2015a,Bruneval_2015,Jacquemin_2015b,Hirose_2015,Jacquemin_2017,Krause_2017,Gui_2018} on large molecular systems performed with the very same running parameters (geometries, basis sets) as the available reference higher-level calculations such as CC3. Such comparisons were grounded in the development of codes replacing the planewave solid-state physics paradigm by well-documented correlation-consistent Gaussian basis sets, together with adequate auxiliary bases when resolution-of-the-identity techniques were used. [REFS]
An important conclusion drawn from these calculations was that the quality of the BSE excitation energies is strongly correlated with the deviation of the preceding $GW$ HOMO-LUMO gap from the experimental (IP-EA) photoemission gap. Standard $G_0W_0$ calculations starting with Kohn-Sham eigenstates generated with (semi)local functionals yield much larger HOMO-LUMO gaps than the input Kohn-Sham one, but still too small as compared to the experimental (IP-EA) value. Such an underestimation of the (IP-EA) gap leads to a similar underestimation of the lowest optical excitation energies.
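To make the link between the photoemission gap and the optical excitations explicit, the following relations may help. The symbols $E_g$, $E_B$ and $\Omega_1$ are notations introduced here for illustration. With the pole convention given earlier for $G$, the frontier quasiparticle energies yield the ionization potential (IP), the electron affinity (EA) and hence the fundamental gap $E_g$:
\begin{gather}
\mathrm{IP} = E_0(N-1) - E_0(N) = -\varepsilon_{\mathrm{HOMO}},
\qquad
\mathrm{EA} = E_0(N) - E_0(N+1) = -\varepsilon_{\mathrm{LUMO}},
\\
E_g = \mathrm{IP} - \mathrm{EA} = \varepsilon_{\mathrm{LUMO}} - \varepsilon_{\mathrm{HOMO}}.
\end{gather}
The lowest optical excitation energy can then be written as $\Omega_1 = E_g - E_B$, which defines the electron-hole binding energy $E_B$. Assuming that $E_B$ is only weakly affected by the starting point, an underestimated $E_g$ translates almost directly into an underestimated $\Omega_1$.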
Such a residual HOMO-LUMO gap problem can be significantly alleviated by adopting exchange-correlation functionals with a tuned amount of exact exchange that yield a much improved Kohn-Sham HOMO-LUMO gap as a starting point for the $GW$ correction. \cite{Bruneval_2013,Rangel_2016,Knight_2016} Alternatively, self-consistent schemes, where corrected eigenvalues, and possibly orbitals, are reinjected in the construction of $G$ and $W$, have been shown to lead to a significant improvement of the quasiparticle energies in the case of molecular systems, with the advantage of significantly removing the dependence on the starting point functional. \cite{Rangel_2016,Kaplan_2016,Caruso_2016} As a result, BSE singlet excitation energies starting from such improved quasiparticle energies were found to be in much better agreement with reference calculations such as CC3. For the sake of illustration, an average 0.2 eV error was found for the well-known Thiel set comprising more than a hundred representative singlet excitations from a large variety of molecules.
\cite{Jacquemin_2015a,Bruneval_2015,Gui_2018,Krause_2017} This is comparable to the best TD-DFT results obtained by scanning a large variety of global hybrid functionals with varying fractions of exact exchange.
A very remarkable success of the BSE formalism lies in the description of the charge-transfer (CT) excitations, a notoriously difficult problem for TD-DFT calculations adopting standard functionals. \cite{Dreuw_2004}
Such a difficulty can be ascribed to the lack of long-range electron-hole interaction with local XC functionals. This problem can be cured within TD-DFT by including a long-range component in the kernel through an exact exchange contribution, a solution that explains in particular the success of range-separated hybrids for the description of CT excitations. \cite{Stein_2009} The analysis of the screened Coulomb potential matrix elements in the BSE kernel (see Eqn.~\ref{Wmatel}) reveals that such long-range (non-local) electron-hole interactions are properly described, including in environments (solvents, molecular solids, etc.) where screening reduces the long-range electron-hole interactions. The success of the BSE formalism in treating CT excitations has been demonstrated in several studies, \cite{Blase_2011b,Baumeier_2012,Duchemin_2012,Sharifzadeh_2013,Cudazzo_2010,Cudazzo_2013} opening the way to important applications such as doping, photovoltaics or photocatalysis in organic systems. We now leave the description of successes to discuss difficulties and perspectives.\\
\noindent{\textbf{The computational challenge.}} As emphasized above, the BSE eigenvalue equation in the occupied-to-virtual product space is formally equivalent to that of TD-DFT or TD-Hartree-Fock. As such, searching iteratively for, typically, the lowest eigenstates presents the same computational cost within BSE and TD-DFT. The main bottleneck resides in the preceding calculations of the $GW$ quasiparticle energies. Within a planewave approach, or using resolution-of-the-identity techniques combined with localized basis sets, $GW$ calculations scale as $\mathcal{O}(N^4)$ with system size. Such a cost is mainly associated with calculating the free-electron susceptibility $\chi_0(\omega)$ at various frequencies with its entangled summations over occupied and virtual states. Pooling empty states with common energy denominators, \cite{Bruneval_2008}
or replacing the sum over unoccupied states by iterative techniques \cite{Umari_2010,Giustino_2010} as already done in TD-DFT, \cite{Walker_2006} are efficient techniques that, however, do not change the scaling with system size.
Another approach, which has recently regained much interest, is the so-called space-time approach by Rojas and coworkers. \cite{Rojas_1995} Borrowing the idea of Laplace transform formulations, already used in quantum chemistry perturbation theories, \cite{Almlof_1991,Haser_1992} combined with a real-space grid formulation, the susceptibility can be factorized so as to decouple the summations over occupied and virtual states:
with $\tilde{\phi}_{i}({\bf r})={\phi_i}({\bf r}) e^{\varepsilon_i \tau}$ and $\tilde{\phi}_{a}({\bf r})={\phi_a}({\bf r}) e^{-\varepsilon_a \tau}$, taking the zero of energy at the chemical potential and with real orbitals. Such an approach leads to cubic-scaling algorithms, independently of any arguments exploiting localization or sparsity in the limit of large systems. It has been recently adapted to cubic-scaling RPA calculations \cite{} and is now blooming in quantum chemistry thanks to the concept of Interpolative Separable Density Fitting (ISDF), which allows decoupling the occupied and virtual orbitals entangled in standard resolution-of-the-identity density- or Coulomb-fitting coefficients.
Combined further with a stochastic treatment of virtual space sampling, impressive linear-scaling formalisms could be established, paving the way to applying the $GW$/BSE formalism to systems comprising several hundred atoms on standard-size laboratory clusters.
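
The decoupling of occupied and virtual summations can be illustrated on the energy denominator alone. The following is a minimal sketch, assuming real orbitals and $\varepsilon_i < 0 < \varepsilon_a$ (zero of energy at the chemical potential). The symbols $S$, $N_o$, $N_v$ and $N_r$ (numbers of occupied states, virtual states and grid points) are introduced here for illustration, and spin and sign prefactors are deliberately left aside. Using $1/x = \int_0^{\infty} e^{-x\tau} d\tau$ for $x>0$, one has
\begin{equation}
\begin{split}
S({\bf r},{\bf r}') & = \sum_{i} \sum_{a} \frac{\phi_i({\bf r}) \phi_a({\bf r}) \phi_i({\bf r}') \phi_a({\bf r}')}{\varepsilon_a - \varepsilon_i}
\\
& = \int_0^{\infty} d\tau \left[ \sum_{i} \phi_i({\bf r}) \phi_i({\bf r}') e^{\varepsilon_i \tau} \right] \left[ \sum_{a} \phi_a({\bf r}) \phi_a({\bf r}') e^{-\varepsilon_a \tau} \right].
\end{split}
\end{equation}
The first line requires of the order of $N_o N_v N_r^2$ operations, whereas each bracket in the second line costs $N_o N_r^2$ or $N_v N_r^2$ operations per $\tau$ point, followed by an $N_r^2$ pointwise product. The overall cost therefore grows cubically rather than quartically with system size, provided the number of $\tau$ quadrature points does not itself grow with system size.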
\noindent{\textbf{The challenge of analytic gradients.}}\\
An additional issue concerns the formalism used to calculate the ground-state energy for a given atomic configuration. Since the BSE formalism presented so far concerns the calculation of the electronic excitations, namely the energy difference between the ground state (GS) and the excited state (ES), gradients of the ES absolute energy require
This points to another direction for the BSE formalism, namely the calculation of the GS total energy with the correlation energy calculated at the BSE level. Such a task was performed by several groups using, notably, the adiabatic connection fluctuation-dissipation theorem (ACFDT), with a focus on small dimers. \cite{Olsen_2014,Holzer_2018b,Li_2020,Loos_2020}
The analysis of the singlet-triplet splitting is central to numerous applications such as singlet fission, thermally activated delayed fluorescence (TADF) or
stability analysis of restricted closed-shell solutions at the HF \cite{Seeger_1977} and TD-DFT \cite{Bauernschmitt_1996} levels.
contaminating as well TD-DFT calculations with popular range-separated hybrids (RSH) that generally contain a large fraction of exact exchange in the long range. \cite{Sears_2011}
While TD-DFT with RSH can benefit from tuning the range-separation parameter as a means to act on the triplet instability, \cite{Sears_2011} BSE calculations do not offer this pragmatic way out since the screened Coulomb potential that builds the kernel does not offer any parameter to tune.
A first cure was offered by hybridizing TD-DFT and BSE, namely adding to the BSE kernel the correlation part of the underlying DFT functional used to build the susceptibility and resulting screened Coulomb potential $W$. \cite{Holzer_2018b}
\noindent{\textbf{The concept of dynamical properties.}}
For a chemist, it may be difficult to understand the concept of dynamical properties, the motivation behind their introduction, and their actual usefulness.
Here, we will try to give a pedagogical example showing the importance of dynamical quantities and their main purposes. \cite{Romaniello_2009,Sangalli_2011,ReiningBook}
To do so, let us consider the usual chemical scenario where one wants to get the neutral excitations of a given system.
In most cases, this can be done by solving a set of linear equations of the form
\begin{equation}
\label{eq:lin_sys}
\bA \bx = \omega \bx,
\end{equation}
where $\omega$ is one of the neutral excitation energies of the system associated with the transition vector $\bx$.
If we assume that the operator $\bA$ has a matrix representation of size $K \times K$, this \textit{linear} set of equations yields $K$ excitation energies.
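However, in practice, $K$ might be very large, and it might therefore be practically useful to recast this system as two smaller coupled systems, such that
\begin{equation}
\label{eq:lin_sys_split}
\begin{pmatrix} \bA_1 & \tr{\bb} \\ \bb & \bA_2 \end{pmatrix}
\begin{pmatrix} \bx_1 \\ \bx_2 \end{pmatrix}
= \omega
\begin{pmatrix} \bx_1 \\ \bx_2 \end{pmatrix},
\end{equation}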
where the blocks $\bA_1$ and $\bA_2$, of sizes $K_1\times K_1$ and $K_2\times K_2$ (with $K_1+ K_2= K$), can be associated with, for example, the single and double excitations of the system.
Note that this \textit{exact} decomposition does not alter, in any case, the values of the excitation energies, nor their eigenvectors.
Solving separately each row of the system \eqref{eq:lin_sys_split} yields
\begin{subequations}
\begin{gather}
\label{eq:row1}
\bA_1 \bx_1 + \tr{\bb}\bx_2 = \omega\bx_1,
\\
\label{eq:row2}
\bx_2 = (\omega\bI - \bA_2)^{-1}\bb\bx_1.
\end{gather}
\end{subequations}
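Substituting Eq.~\eqref{eq:row2} into Eq.~\eqref{eq:row1} yields the following effective \textit{non-linear}, frequency-dependent operator
\begin{equation}
\label{eq:non_lin_sys}
\Tilde{\bA}_1(\omega) \bx_1 = \omega \bx_1,
\qquad
\Tilde{\bA}_1(\omega) = \bA_1 + \tr{\bb} (\omega \bI - \bA_2)^{-1} \bb,
\end{equation}
which has, by construction, exactly the same solutions as the linear system \eqref{eq:lin_sys} but a smaller dimension.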
For example, an operator $\Tilde{\bA}_1(\omega)$ built in the basis of single excitations can potentially provide excitation energies for double excitations thanks to its frequency-dependent nature, the information from the double excitations being ``folded'' into $\Tilde{\bA}_1(\omega)$ via Eq.~\eqref{eq:row2}. \cite{Romaniello_2009,Sangalli_2011,ReiningBook}
How have we been able to reduce the dimension of the problem while keeping the same number of solutions?
To do so, we have transformed a linear operator $\bA$ into a non-linear operator $\Tilde{\bA}_1(\omega)$ by making it frequency dependent.
In other words, we have sacrificed the linearity of the system in order to obtain a new, non-linear system of equations of smaller dimension.
This procedure converting degrees of freedom into frequency or energy dependence is very general and can be applied in various contexts. \cite{Gatti_2007,Garniron_2018}
Thanks to its non-linearity, Eq.~\eqref{eq:non_lin_sys} can produce more solutions than its actual dimension.
However, because there is no free lunch, this non-linear system is obviously harder to solve than its corresponding linear analogue given by Eq.~\eqref{eq:lin_sys}.
Nonetheless, approximations can now be applied to Eq.~\eqref{eq:non_lin_sys} in order to solve it efficiently.
One of these approximations is the so-called \textit{static} approximation, which corresponds to fixing the frequency to a particular value.
For example, as commonly done within the Bethe-Salpeter formalism, $\Tilde{\bA}_1(\omega)=\Tilde{\bA}_1\equiv\Tilde{\bA}_1(\omega=0)$.
In such a way, the operator $\Tilde{\bA}_1$ is made linear again by removing its frequency-dependent nature.
This approximation comes with a heavy price as the number of solutions provided by the system of equations \eqref{eq:non_lin_sys} has now been reduced from $K$ to $K_1$.
Coming back to our example, in the static approximation, the operator $\Tilde{\bA}_1$ built in the basis of single excitations cannot provide double excitations anymore, and the only $K_1$ excitation energies are associated with single excitations.
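As a minimal worked illustration of the above discussion, consider the smallest possible case, $K_1 = K_2 = 1$. The scalars $a_1$, $a_2$ and $b$ (with $b \neq 0$ and $a_2 \neq 0$) are hypothetical numbers introduced here for the sake of the example, playing the roles of $\bA_1$, $\bA_2$ and $\bb$. The full linear problem has the two solutions
\begin{equation}
\omega_{\pm} = \frac{a_1 + a_2}{2} \pm \sqrt{ \frac{(a_1 - a_2)^2}{4} + b^2 },
\end{equation}
while the folded, frequency-dependent problem reads
\begin{equation}
a_1 + \frac{b^2}{\omega - a_2} = \omega
\quad \Longleftrightarrow \quad
(\omega - a_1)(\omega - a_2) = b^2,
\end{equation}
which is a quadratic equation in $\omega$ with the same two roots $\omega_{\pm}$, although its dimension is one. In the static approximation, $\omega$ is set to zero in the denominator and only a single solution survives,
\begin{equation}
\omega_{\mathrm{static}} = a_1 - \frac{b^2}{a_2},
\end{equation}
so that the second solution, which would be associated with the ``double'' excitation in the example above, is lost.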