.. _Presentation-label: Presentation ++++++++++++ Let :math:`\Omega_h` be a triangular (2d) or tetrahedral (3d) mesh of :math:`\Omega\subset\R^d` corresponding to the following data structure: .. math:: \mbox{\begin{tabular}{lccll} \hline \textbf{name} & \textbf{type} & \textbf{dimension} & \textbf{description}\\ \hline $\nq$ & integer & 1 & number of vertices\\ $\nme$ & integer & 1 & number of elements\\ $\q$ & double & $d \times \nq$ & \begin{minipage}[t]{7.9cm} array of vertices coordinates. $\q(\nu,j)$ is the $\nu$-th coordinate of the $j$-th vertex, $\nu\in\{1,\hdots,d\}$, $j\in\{1,\hdots,\rm{n_q}\}.$ The $j$-th vertex will be also denoted by $\rm{q}^j$ \end{minipage}\\ $\me$ & integer & $(d+1) \times \nme$ & \begin{minipage}[t]{7.9cm} connectivity array. $\me(\beta,k)$ is the storage index of the $\beta$-th vertex of the $k$-th element, in the array~$q$, for $\beta\in\{1,\hdots,(d+1)\}$ and $k\in\{1,\hdots,{\nme}\}$ \end{minipage}\\ $\rm volumes$ & double & $1\times {\nme}$ & \begin{minipage}[t]{7.9cm} array of volumes in 3d or areas in 2d. ${\rm volumes}(k)$ is the $k$-th element volume (3d) or element area (2d), $k\in\{1,\hdots,{\nme}\}$ \end{minipage}\\ \hline \end{tabular}} - **Scalar finite elements :** Let :math:`\HUnHD{\DOMH}=\{v \in {\cal C}^0(\overline \DOMH),\ \ {v}_{|T} \in {\cal P}_1(T), \ \forall T \in {\cal T}_h \},` be the finite dimensional space spanned by the :math:`{\cal P}_1`-Lagrange (scalar) basis functions :math:`\{\varphi_i\}_{i\in\ENS{1}{\nq}}` where :math:`{\cal P}_1(T)` denotes the space of all polynomials defined over `T` of total degree less than or equal to :math:`1.` The functions :math:`\varphi_i` satisfy .. math:: \varphi_i(\rm{q}^j)=\delta_{i,j},\ \ \forall (i,j)\in\{1,\hdots,\rm{n_q}\}^2 Then we have :math:`\varphi_i\equiv 0` on :math:`T_k`, :math:`\forall k\in\{1,\hdots,{\nme}\}` such that :math:`\q^i\notin T_k`. A :math:`{\cal P}_1`-Lagrange (scalar) finite element matrix is of the generic form .. math:: \MAT{D}_{i,j}=\int_\DOMH \mathcal{D}(\varphi_j,\varphi_i) d\DOMH,\ \ \forall (i,j)\in\ENS{1}{\nq}^2 where :math:`\mathcal{D}` is the bilinear differential operator (order :math:`1`) defined for all :math:`u,v\in \HUnHD{\DOMH}` by .. math:: \mathcal{D}(u,v)=\DOT{\MAT{A}\GRAD u}{\GRAD v} -\left(u\DOT{\vecb{b}}{\GRAD v} - v \DOT{\GRAD u}{\vecb{c}}\right) +d uv and :math:`\MAT{A}\in (\LInfD{\DOMH})^{d\times d},` :math:`\vecb{b}\in(\LInfD{\DOMH})^{d},` :math:`\vecb{c}\in(\LInfD{\DOMH})^{d}` and :math:`d\in\LInfD{\DOMH}` are given functions on :math:`\DOMH.` .. Let defined the bilineare application, :math:`a_h`, by .. math:: a_h(u,v) = \int_{\Omega_h} \mathcal{A}_h(u,v) d\Omega_h with .. math:: \mathcal{A}_h(u,v)= \sum_{i,j=1}^d a_{i,j} \DP{u}{x_i}\DP{v}{x_j} +\sum_{i=1}^d (b_i u \DP{v}{x_i} +c_i v \DP{u}{x_i}) + a_0 uv where :math:`a_{i,j},\ b_i,\ c_i,\ a_0` are given functions on :math:`\Omega_h.` A generic Finite Element matrix may write .. math:: \MAT{M}_{i,j}=a_h(\varphi_i,\varphi_j) We can notice that :math:`\MAT{D}` is a **sparse** matrix due to support properties of :math:`\varphi_i` functions. Let :math:`\ndfe=d+1.` The element matrix :math:`\MAT{D}_{\il,\jl}(T_k)`, associated to :math:`\MAT{D}`, is the :math:`\ndfe\times\ndfe` matrix defined by .. math:: \MAT{D}_{\il,\jl}(T_k)=\int_{T_k} \mathcal{D}(\varphi_{\jl}^k,\varphi_{\il}^k)(\q) d\q,\ \ \forall (\il,\jl)\in\{1,\hdots,\ndfe\}^2 where :math:`\varphi_{\il}^k=\varphi_{\me(\il,k)}` is the :math:`\il`-th local basis function associated to the :math:`k`-th element. For examples, * the **Mass** matrix is defined by .. math:: \MAT{M}_{i,j}=\int_\DOMH \BasisFunc_j \BasisFunc_i d\DOMH The corresponding bilinear differential operator :math:`\mathcal{D}` is completely defined with :math:`\MAT{A}=0`, :math:`\vecb{b}=\vecb{c}=0` and :math:`d=1.` * the **Stiffness** matrix is defined by .. math:: \MAT{S}_{i,j}=\int_\DOMH \DOT{\GRAD\BasisFunc_j}{\GRAD \BasisFunc_i} d\DOMH The corresponding bilinear differential operator :math:`\mathcal{D}` is completely defined with :math:`\MAT{A}=\MAT{I}`, :math:`\vecb{b}=\vecb{c}=0` and :math:`d=0.` - **Vector finite elements :** The dimension of the space :math:`(\HUnHD{\DOMH})^d` is :math:`\ndf=d\,\nq.` * in dimension :math:`d=2,` we have two natural basis : .. .. math:: \BasisFuncTwoD_{di-\il}= \BasisFunc_i\begin{pmatrix} \delta_{d-1,\il} \\ \vdots \\ \delta_{0,\il} \end{pmatrix}, \ \forall \il \in \ENS{0}{d-1}, \ \forall i \in \ENS{1}{\nq}. the global *alternate* basis :math:`\mathcal{B}_a` defined by .. math:: \mathcal{B}_a=\{\FoncBaseDeuxD_1,\hdots,\FoncBaseDeuxD_{2\rm{n_q}}\}=\left\{ \begin{pmatrix} \varphi_1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \varphi_1 \end{pmatrix}, \begin{pmatrix} \varphi_2 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \varphi_2\end{pmatrix}, \hdots, \begin{pmatrix} \varphi_{\rm n_q} \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \varphi_{\rm n_q} \end{pmatrix} \right\} and the global *block* basis :math:`\mathcal{B}_b` defined by .. math:: \mathcal{B}_b=\{\FoncBaseDeuxDB_1,\hdots,\FoncBaseDeuxDB_{2\rm{n_q}}\}=\left\{ \begin{pmatrix} \varphi_1 \\ 0 \end{pmatrix}, \begin{pmatrix} \varphi_2 \\ 0 \end{pmatrix},\hdots, \begin{pmatrix} \varphi_{\rm n_q} \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \varphi_1 \end{pmatrix}, \begin{pmatrix} 0 \\ \varphi_2\end{pmatrix},\hdots, \begin{pmatrix} 0 \\ \varphi_{\rm n_q} \end{pmatrix} \right\}. * in dimension :math:`d=3,` we have two natural basis : the global *alternate* basis :math:`\mathcal{B}_a` defined by .. math:: \mathcal{B}_a=\{\FoncBaseDeuxD_1,\hdots,\FoncBaseDeuxD_{3\nq}\}=\left\{ \begin{pmatrix} \varphi_1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \varphi_1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ \varphi_1 \end{pmatrix}, \hdots, \begin{pmatrix} \varphi_\nq \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ \varphi_\nq \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0 \\ \varphi_\nq \end{pmatrix} \right\} and the global *block* basis :math:`\mathcal{B}_b` defined by .. math:: \mathcal{B}_b=\{\FoncBaseDeuxDB_1,\hdots,\FoncBaseDeuxDB_{3\nq}\}=\left\{ \begin{pmatrix} \varphi_1 \\ 0 \\ 0\end{pmatrix},\hdots, \begin{pmatrix} \varphi_\nq \\ 0 \\0\end{pmatrix}, \begin{pmatrix} 0 \\ \varphi_1 \\ 0\end{pmatrix},\hdots, \begin{pmatrix} 0 \\ \varphi_\nq \\0 \end{pmatrix}, \begin{pmatrix} 0 \\ 0\\ \varphi_1 \end{pmatrix},\hdots, \begin{pmatrix} 0 \\ 0\\ \varphi_\nq \end{pmatrix} \right\}. A :math:`{\cal P}_1`-Lagrange (vector) finite element matrix in :math:`\mathcal{B}_a` basis is of the generic form .. math:: \MAT{H}_{l,m}=\int_\DOMH \mathcal{H}(\BasisFuncTwoD_m,\BasisFuncTwoD_l) d\DOMH,\ \ \forall (l,m)\in\ENS{1}{d\nq}^2 where :math:`\mathcal{H}` is the bilinear differential operator (order :math:`1`) defined by .. math:: \mathcal{H}(\vecb{u},\vecb{v})=\sum_{\il,\jl=1}^d \mathcal{D}^{\il,\jl}(u_\il,v_\jl) and :math:`\mathcal{D}^{\il,\jl}` is a bilinear differential operator of scalar type. For example, in dimension :math:`d=2,` the **Elastic Stiffness** matrix is defined by .. math:: \mathbb{K}_{m,l}=\int_{\Omega_h} \underline{\pmb{\epsilon}}^t(\psi_l(q)) \MAT{C} \underline{\pmb{\epsilon}}(\psi_m(q)) , \ \ \forall (m,l) \in \{1,...,2\rm{n_q}\}^2. where :math:`\underline{\pmb{\epsilon}}=(\epsilon_{xx},\epsilon_{yy},\epsilon_{xy})^t` is the strain tensor respectively and :math:`\MAT{C}` is the Hooke matrix .. math:: \MAT{C}=\begin{pmatrix} \lambda+2\mu & \lambda & 0\\ \lambda & \lambda+2\mu & 0\\ 0 & 0 & \mu \end{pmatrix} Then the bilinear differential operator associated to this matrix is given by .. math:: \begin{array}{lcl} \mathcal{H}(\vecb{u},\vecb{v})&=& \DOT{\begin{pmatrix} \lambda+2\mu & 0\\ 0 & \mu \end{pmatrix}\GRAD u_1 }{\GRAD v_1} +\DOT{\begin{pmatrix} 0 & \mu\\ \lambda & 0 \end{pmatrix}\GRAD u_2 }{\GRAD v_1}\\ &+&\DOT{\begin{pmatrix} 0 & \lambda\\ \mu & 0 \end{pmatrix}\GRAD u_1 }{\GRAD v_2}+ \DOT{\begin{pmatrix} \mu & 0\\ 0 &\lambda+2\mu\end{pmatrix}\GRAD u_2 }{\GRAD v_2} \end{array} We present now three algorithms (``base``, ``OptV1`` and ``OptV2`` versions) for assembling this kind of matrix. .. note:: These algorithms can be successfully implemented in various interpreted languages under some assumptions. For all versions, it must have a sparse matrix implementation. For ``OptV1`` and ``OptV2`` versions, we also need a particular sparse matrix constructor (see :ref:`Sparse_matrix_requirement`). And, finally, ``OptV2`` also required that the interpreted languages support classical **vectorization** operations. Here is the current list of interpreted languages for which we have successfully implemented these three algorithms : - Python with *Numpy* and *Scipy* modules, - Matlab, - Octave, - Scilab Classical assembly algorithm (``base`` version) ------------------------------------------------- Due to support properties of :math:`P_1`-Lagrange basis functions, we have the classical algorithm : .. note:: We recall the classical matrix assembly in dimension :math:`d` with :math:`\ndfe=d+1.` .. figure:: images/AssemblyMat_base_algo.png :width: 600px :scale: 100% :align: center Classical matrix assembly in 2d or 3d In fact, for each element we add its element matrix to the global sparse matrix (lines 4 to 10 of the previous algorithm). This operation is illustrated in the following figure in 2d scalar fields case : .. figure:: images/AssemblyMat_base.png :width: 600px :scale: 90% :align: center Adding of an element matrix to global matrix in 2d scalar fields case The repetition of elements insertion in **sparse** matrix is very expensive. .. _Sparse_matrix_requirement: Sparse matrix requirement ------------------------- The interpreted language must contain a function to generate a sparse matrix ``M`` from three 1d arrays of same length ``Ig``, ``Jg`` and ``Kg`` such that ``M(Ig(k),Jg(k))=Kg(k)`` . Furthermore, the elements of ``Kg`` having the same indices in ``Ig`` and ``Jg`` must be summed. We give for several interpreted languages the corresponding function : - Python (*scipy.sparse* module) : ``M=sparse._matrix(Kg,(Ig,Jg),shape=(m,n)`` where ```` is the sparse matrix format chosen in ``csc`` , ``csr`` , ``lil`` ,... - Matlab : ``M=sparse(Ig,Jg,Kg,m,n)``, only ``csc`` format. - Octave : ``M=sparse(Ig,Jg,Kg,m,n)``, only ``csc`` format. - Scilab : ``M=sparse([Ig,Jg],Kg,[m,n])``, only ``row-by-row`` format. Obviously, this kind of function exists in compiled languages. For example, in C language, one can use the *SuiteSparse* from T. Davis and with Nvidia GPU, one can use *Thrust* library. Optimized classical assembly algorithm (``OptV1`` version) ------------------------------------------------------------ The idea is to create three global 1d-arrays :math:`\vecb{I}_g,` :math:`\vecb{J}_g` and :math:`\vecb{K}_g` allowing the storage of the element matrices as well as the position of their elements in the global matrix. The length of each array is :math:`\ndfes \nme.` (i.e. :math:`9 \nme` for :math:`d=2` and :math:`16 \nme` for :math:`d=3` ). Once these arrays are created, the matrix assembly is obtained with one of the previous commands. To create these three arrays, we first define three local 1d-arrays :math:`\vecb{K}^e_k,` :math:`\vecb{I}^e_k` and :math:`\vecb{J}^e_k` of :math:`\ndfes` elements obtained from a generic element matrix :math:`\MAT{E}(T_k)` of dimension :math:`\ndfe` : .. figure:: images/OptV1_IgeJgeKge.png :width: 700px :scale: 100% :align: center From these arrays, it is then possible to build the three global arrays :math:`\vecb{I}_g,` :math:`\vecb{J}_g` and :math:`\vecb{K}_g,` of size :math:`\ndfes\nme \times 1` defined, :math:`\forall k \in \ENS{1}{\nme},` :math:`\forall l \in\ENS{1}{\ndfes},` by .. math:: \begin{array}{lcl} \vecb{K}(\ndfes(k-1)+l)&=&\vecb{K}^e_k(l),\\ \vecb{I}(\ndfes(k-1)+l)&=&\vecb{I}^e_k(l)=\mathcal{I}^k((l-1) \bmod m +1),\\ \vecb{J}(\ndfes(k-1)+l)&=&\vecb{J}^e_k(l)=\mathcal{I}^k(E((l-1)/ m)+1). \end{array} So, for each triangle :math:`T_k`, we have .. figure:: images/AssemblyMat_OptV1_generic.png :width: 700px :scale: 100% :align: center Then, a simple algorithm can build these three arrays using a loop over each triangle. .. note:: We give the complete ``OptV1`` algorithm .. figure:: images/AssemblyMat_OptV1_algo.png :width: 600px :scale: 100% :align: center .. _PresentationOptV2-label: New Optimized assembly algorithm (``OptV2`` version) ------------------------------------------------------ We present the optimized version 2 algorithm where no loop is used. We define three 2d-arrays that allow to store all the element matrices as well as their positions in the global matrix. We denote by :math:`\MAT{K}_g,` :math:`\MAT{I}_g` and :math:`\MAT{J}_g` these :math:`\ndfes-by-\nme` arrays, defined :math:`\forall k \in \{1,\hdots,\nme\},` :math:`\forall il \in\{1,\hdots,\ndfes\}` by .. math:: \MAT{K}_g(il,k)=\vecb{K}^e_k(il),\quad\quad \MAT{I}_g(il,k)=\vecb{I}^e_k(il),\quad\quad \MAT{J}_g(il,k)=\vecb{J}^e_k(il). :label: eq_IgJgKgdef The three local arrays :math:`\vecb{K}^e_k,` :math:`\vecb{I}^e_k` and :math:`\vecb{J}^e_k` are thus stored in the :math:`k^{th}` column of the global arrays :math:`\MAT{K}_g,` :math:`\MAT{I}_g` and :math:`\MAT{J}_g` respectively. A natural way to build these three arrays consists in using a loop through the triangles :math:`T_k` in which we insert the local arrays column-wise .. figure:: images/AssembleMatV2d.png :width: 800px :scale: 100% :align: center Construction of :math:`\mathbb{K}_g`, :math:`\mathbb{I}_g` and :math:`\mathbb{J}_g` with loop The natural construction of these three arrays is done column-wise. So, for each array there are :math:`\nme` columns to compute, which depends on the mesh size. To vectorize, we must fill these arrays by row-wise operations and then for each array there are :math:`\ndfes` rows to compute. We recall that :math:`\ndfe` does not depend on the mesh size. These rows insertions are represented in Figure :num:`#figureigjgkgoptv2` . We can also remark that, :math:`\forall (\il,\jl)\in\ENS{1}{\ndfe}`, with :math:`m=\ndfe,` .. math:: \MAT{K}_g(m(\jl-1)+\il,k)=e^k_{\il,\jl},\ \ \MAT{I}_g(m(\jl-1)+\il,k)=\mathcal{I}^k_\il,\ \ \MAT{J}_g(m(\jl-1)+\il,k)=\mathcal{I}^k_\jl. where, in scalar fields case, :math:`\mathcal{I}^k_\il=\me(\il,k),` :math:`\forall \il\in\ENS{1}{d+1}` and in vector fields case, :math:`\mathcal{I}^k_{d a - b}=d \me(a,k)-b,` :math:`\forall a \in \ENS{1}{d+1},` :math:`\forall b \in \ENS{0}{d-1}.` .. _figureIgJgKgOptV2: .. figure:: images/IgJgKg_OptV2.png :width: 800px :scale: 100% :align: center Row-wise operations on global arrays :math:`\mathbb{I}_g,` :math:`\mathbb{J}_g` and :math:`\mathbb{K}_g` As we can see in Figures :num:`#figurebuild2digjgoptv2scalar` and :num:`#figurebuild3digjgoptv2scalar`, it is quite easy to vectorize :math:`\MAT{I}_g` and :math:`\MAT{J}_g` computations in scalar fields case by filling these arrays lines by lines : .. _figureBuild2DIgJgOptV2scalar: .. figure:: images/Build2D_IgJg_OptV2.png :width: 800px :scale: 100% :align: center Construction of :math:`\mathbb{I}_g` and :math:`\mathbb{J}_g` : ``OptV2`` version for 2d *scalar* matrix .. _figureBuild3DIgJgOptV2scalar: .. figure:: images/Build3D_IgJg_OptV2.png :width: 800px :scale: 100% :align: center Construction of :math:`\mathbb{I}_g` and :math:`\mathbb{J}_g` : ``OptV2`` version for 3d *scalar* matrix Using vectorization tools, we can compute :math:`\MAT{I}_g` and :math:`\MAT{J}_g` in one line. The vectorized algorithms in 2d and 3d scalar fields are represented by Figure :num:`figurebuildigjgoptv2scalaralgo`. .. _FigureBuildIgJgOptV2scalaralgo: .. figure:: images/BuildIgJg_scalar_algo.png :width: 600px :scale: 100% :align: center Vectorized algorithms for :math:`\mathbb{I}_g` and :math:`\mathbb{J}_g` computations In vector fields case, we construct the tabular :math:`\MAT{T}` such that :math:`\MAT{T}(\alpha,k)=\mathcal{I}^k_\il,` :math:`\forall \il\in\ENS{1}{\ndfe}.` Then we can vectorize :math:`\MAT{I}_g` and :math:`\MAT{J}_g` computations in vector fields case. We represent in Figure :num:`figurebuild2digjgoptv2vector` these operations in 2d. .. _figureBuild2DIgJgOptV2vector: .. figure:: images/IgJgTabularMECA.png :width: 800px :scale: 100% :align: center Construction of :math:`\mathbb{I}_g` and :math:`\mathbb{J}_g` : ``OptV2`` version in 2d vector fields case Now, it remains to vectorize the computation of :math:`\mathbb{K}_g` array which contains all the element matrices associated to :math:`\MAT{D}` or :math:`\MAT{H}` : it should be done by **row-wise** vector operations. We now describe a vectorized construction of :math:`\mathbb{K}_g` array associated to a generic bilinear form :math:`a`. For the mesh element :math:`T_k`, the element matrix is then given by .. math:: \MAT{E}(T_k)=\begin{pmatrix} a(\varphi_{i_1^k},\varphi_{i_1^k}) & a(\varphi_{i_1^k},\varphi_{i_2^k}) & a(\varphi_{i_1^k},\varphi_{i_3^k})\\ a(\varphi_{i_2^k},\varphi_{i_1^k}) & a(\varphi_{i_2^k},\varphi_{i_2^k}) & a(\varphi_{i_2^k},\varphi_{i_3^k})\\ a(\varphi_{i_3^k},\varphi_{i_1^k}) & a(\varphi_{i_3^k},\varphi_{i_2^k}) & a(\varphi_{i_3^k},\varphi_{i_3^k}) \end{pmatrix} .. figure:: images/Build2D_Kg_generic_OptV2.png :width: 800px :scale: 100% :align: center Construction of :math:`\mathbb{K}_g` for **Mass** matrix .. figure:: images/Build2D_Kg_Mass_OptV2.png :width: 800px :scale: 100% :align: center Construction of :math:`\mathbb{K}_g` for **Mass** matrix