Presentation

Let \Omega_h be a triangular (2d) or tetrahedral (3d) mesh of \Omega\subset\R^d corresponding to the following data structure:

\mbox{\begin{tabular}{lccll}
\hline
\textbf{name} & \textbf{type} & \textbf{dimension} & \textbf{description}\\
\hline
$\nq$ & integer & 1 & number of vertices\\
$\nme$ & integer & 1 & number of elements\\
$\q$   & double  & $d \times \nq$ &
\begin{minipage}[t]{7.9cm}
array of vertices coordinates. $\q(\nu,j)$ is the $\nu$-th coordinate of the $j$-th vertex,
$\nu\in\{1,\hdots,d\}$, $j\in\{1,\hdots,\rm{n_q}\}.$
The $j$-th vertex will be also denoted by $\rm{q}^j$
\end{minipage}\\
$\me$  & integer & $(d+1) \times \nme$ &
\begin{minipage}[t]{7.9cm}
connectivity array. $\me(\beta,k)$ is the storage index of the $\beta$-th vertex
of the $k$-th element, in the array~$q$, for $\beta\in\{1,\hdots,(d+1)\}$ and $k\in\{1,\hdots,{\nme}\}$
\end{minipage}\\
$\rm volumes$  & double & $1\times {\nme}$ &
\begin{minipage}[t]{7.9cm}
array of volumes in 3d or areas in 2d. ${\rm volumes}(k)$ is the $k$-th element volume (3d)
or element area (2d),
$k\in\{1,\hdots,{\nme}\}$
\end{minipage}\\
\hline
\end{tabular}}

  • Scalar finite elements :

    Let \HUnHD{\DOMH}=\{v \in {\cal C}^0(\overline \DOMH),\ \  {v}_{|T} \in {\cal P}_1(T), \ \forall T \in {\cal T}_h \}, be the finite dimensional space spanned by the {\cal P}_1-Lagrange (scalar) basis functions \{\varphi_i\}_{i\in\ENS{1}{\nq}} where {\cal P}_1(T) denotes the space of all polynomials defined over T of total degree less than or equal to 1. The functions \varphi_i satisfy

    \varphi_i(\rm{q}^j)=\delta_{i,j},\ \ \forall (i,j)\in\{1,\hdots,\rm{n_q}\}^2

    Then we have \varphi_i\equiv 0 on T_k, \forall k\in\{1,\hdots,{\nme}\} such that \q^i\notin T_k.

    A {\cal P}_1-Lagrange (scalar) finite element matrix is of the generic form

    \MAT{D}_{i,j}=\int_\DOMH \mathcal{D}(\varphi_j,\varphi_i) d\DOMH,\ \
\forall (i,j)\in\ENS{1}{\nq}^2

    where \mathcal{D} is the bilinear differential operator (order 1) defined for all u,v\in \HUnHD{\DOMH} by

    \mathcal{D}(u,v)=\DOT{\MAT{A}\GRAD u}{\GRAD v}
  -\left(u\DOT{\vecb{b}}{\GRAD v} - v \DOT{\GRAD u}{\vecb{c}}\right)
  +d uv

    and \MAT{A}\in (\LInfD{\DOMH})^{d\times d}, \vecb{b}\in(\LInfD{\DOMH})^{d}, \vecb{c}\in(\LInfD{\DOMH})^{d} and d\in\LInfD{\DOMH} are given functions on \DOMH.

    We can notice that \MAT{D} is a sparse matrix due to support properties of \varphi_i functions.

    Let \ndfe=d+1. The element matrix \MAT{D}_{\il,\jl}(T_k), associated to \MAT{D}, is the \ndfe\times\ndfe matrix defined by

    \MAT{D}_{\il,\jl}(T_k)=\int_{T_k} \mathcal{D}(\varphi_{\jl}^k,\varphi_{\il}^k)(\q) d\q,\ \
\forall (\il,\jl)\in\{1,\hdots,\ndfe\}^2

    where \varphi_{\il}^k=\varphi_{\me(\il,k)} is the \il-th local basis function associated to the k-th element.

    For examples,

    • the Mass matrix is defined by

      \MAT{M}_{i,j}=\int_\DOMH \BasisFunc_j \BasisFunc_i d\DOMH

      The corresponding bilinear differential operator \mathcal{D} is completely defined with \MAT{A}=0, \vecb{b}=\vecb{c}=0 and d=1.

    • the Stiffness matrix is defined by

      \MAT{S}_{i,j}=\int_\DOMH \DOT{\GRAD\BasisFunc_j}{\GRAD \BasisFunc_i} d\DOMH

      The corresponding bilinear differential operator \mathcal{D} is completely defined with \MAT{A}=\MAT{I}, \vecb{b}=\vecb{c}=0 and d=0.

  • Vector finite elements :

    The dimension of the space (\HUnHD{\DOMH})^d is \ndf=d\,\nq.

    • in dimension d=2, we have two natural basis :

      the global alternate basis \mathcal{B}_a defined by

      \mathcal{B}_a=\{\FoncBaseDeuxD_1,\hdots,\FoncBaseDeuxD_{2\rm{n_q}}\}=\left\{
\begin{pmatrix}  \varphi_1 \\ 0 \end{pmatrix},
\begin{pmatrix}  0  \\ \varphi_1 \end{pmatrix},
\begin{pmatrix}  \varphi_2 \\ 0 \end{pmatrix},
\begin{pmatrix}  0  \\ \varphi_2\end{pmatrix}, \hdots,
\begin{pmatrix} \varphi_{\rm n_q} \\ 0 \end{pmatrix},
\begin{pmatrix} 0  \\ \varphi_{\rm n_q} \end{pmatrix}
\right\}

      and the global block basis \mathcal{B}_b defined by

      \mathcal{B}_b=\{\FoncBaseDeuxDB_1,\hdots,\FoncBaseDeuxDB_{2\rm{n_q}}\}=\left\{
\begin{pmatrix}  \varphi_1 \\ 0 \end{pmatrix},
\begin{pmatrix}  \varphi_2 \\ 0 \end{pmatrix},\hdots,
\begin{pmatrix}  \varphi_{\rm n_q} \\ 0 \end{pmatrix},
\begin{pmatrix}  0  \\ \varphi_1 \end{pmatrix},
\begin{pmatrix}  0  \\ \varphi_2\end{pmatrix},\hdots,
\begin{pmatrix} 0  \\ \varphi_{\rm n_q} \end{pmatrix}
\right\}.

    • in dimension d=3, we have two natural basis :

      the global alternate basis \mathcal{B}_a defined by

      \mathcal{B}_a=\{\FoncBaseDeuxD_1,\hdots,\FoncBaseDeuxD_{3\nq}\}=\left\{
\begin{pmatrix}  \varphi_1 \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix}  0  \\ \varphi_1 \\ 0 \end{pmatrix},
\begin{pmatrix}  0  \\ 0 \\ \varphi_1 \end{pmatrix}, \hdots,
\begin{pmatrix}  \varphi_\nq \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix}  0  \\ \varphi_\nq \\ 0 \end{pmatrix},
\begin{pmatrix}  0  \\ 0 \\ \varphi_\nq \end{pmatrix}
\right\}

      and the global block basis \mathcal{B}_b defined by

      \mathcal{B}_b=\{\FoncBaseDeuxDB_1,\hdots,\FoncBaseDeuxDB_{3\nq}\}=\left\{
\begin{pmatrix}  \varphi_1 \\ 0 \\ 0\end{pmatrix},\hdots,
\begin{pmatrix}  \varphi_\nq \\ 0 \\0\end{pmatrix},
\begin{pmatrix}  0  \\ \varphi_1 \\ 0\end{pmatrix},\hdots,
\begin{pmatrix} 0  \\ \varphi_\nq \\0 \end{pmatrix},
\begin{pmatrix}  0  \\ 0\\ \varphi_1 \end{pmatrix},\hdots,
\begin{pmatrix} 0  \\ 0\\ \varphi_\nq \end{pmatrix}
\right\}.

    A {\cal P}_1-Lagrange (vector) finite element matrix in \mathcal{B}_a basis is of the generic form

    \MAT{H}_{l,m}=\int_\DOMH \mathcal{H}(\BasisFuncTwoD_m,\BasisFuncTwoD_l) d\DOMH,\ \
\forall (l,m)\in\ENS{1}{d\nq}^2

    where \mathcal{H} is the bilinear differential operator (order 1) defined by

    \mathcal{H}(\vecb{u},\vecb{v})=\sum_{\il,\jl=1}^d \mathcal{D}^{\il,\jl}(u_\il,v_\jl)

    and \mathcal{D}^{\il,\jl} is a bilinear differential operator of scalar type.

    For example, in dimension d=2, the Elastic Stiffness matrix is defined by

    \mathbb{K}_{m,l}=\int_{\Omega_h} \underline{\pmb{\epsilon}}^t(\psi_l(q))
\MAT{C} \underline{\pmb{\epsilon}}(\psi_m(q))
,
\ \ \forall (m,l) \in \{1,...,2\rm{n_q}\}^2.

    where \underline{\pmb{\epsilon}}=(\epsilon_{xx},\epsilon_{yy},\epsilon_{xy})^t is the strain tensor respectively and \MAT{C} is the Hooke matrix

    \MAT{C}=\begin{pmatrix}
          \lambda+2\mu & \lambda & 0\\
          \lambda & \lambda+2\mu & 0\\
          0 & 0 & \mu
        \end{pmatrix}

    Then the bilinear differential operator associated to this matrix is given by

    \begin{array}{lcl}
\mathcal{H}(\vecb{u},\vecb{v})&=&
\DOT{\begin{pmatrix} \lambda+2\mu & 0\\ 0 & \mu \end{pmatrix}\GRAD u_1 }{\GRAD v_1}
+\DOT{\begin{pmatrix} 0 & \mu\\ \lambda & 0 \end{pmatrix}\GRAD u_2 }{\GRAD v_1}\\
&+&\DOT{\begin{pmatrix} 0 & \lambda\\ \mu & 0 \end{pmatrix}\GRAD u_1 }{\GRAD v_2}+
\DOT{\begin{pmatrix} \mu & 0\\ 0 &\lambda+2\mu\end{pmatrix}\GRAD u_2 }{\GRAD v_2}
\end{array}

We present now three algorithms (base, OptV1 and OptV2 versions) for assembling this kind of matrix.

Note

These algorithms can be successfully implemented in various interpreted languages under some assumptions. For all versions, it must have a sparse matrix implementation. For OptV1 and OptV2 versions, we also need a particular sparse matrix constructor (see Sparse matrix requirement). And, finally, OptV2 also required that the interpreted languages support classical vectorization operations. Here is the current list of interpreted languages for which we have successfully implemented these three algorithms :

  • Python with Numpy and Scipy modules,
  • Matlab,
  • Octave,
  • Scilab

Classical assembly algorithm (base version)

Due to support properties of P_1-Lagrange basis functions, we have the classical algorithm :

Note

We recall the classical matrix assembly in dimension d with \ndfe=d+1.

_images/AssemblyMat_base_algo.png

Figure 43: Classical matrix assembly in 2d or 3d

In fact, for each element we add its element matrix to the global sparse matrix (lines 4 to 10 of the previous algorithm). This operation is illustrated in the following figure in 2d scalar fields case :

_images/AssemblyMat_base.png

Figure 44: Adding of an element matrix to global matrix in 2d scalar fields case

The repetition of elements insertion in sparse matrix is very expensive.

Sparse matrix requirement

The interpreted language must contain a function to generate a sparse matrix M from three 1d arrays of same length Ig, Jg and Kg such that M(Ig(k),Jg(k))=Kg(k) . Furthermore, the elements of Kg having the same indices in Ig and Jg must be summed.

We give for several interpreted languages the corresponding function :
  • Python (scipy.sparse module) : M=sparse.<format>_matrix(Kg,(Ig,Jg),shape=(m,n) where <format> is the sparse matrix format chosen in csc , csr , lil ,...
  • Matlab : M=sparse(Ig,Jg,Kg,m,n), only csc format.
  • Octave : M=sparse(Ig,Jg,Kg,m,n), only csc format.
  • Scilab : M=sparse([Ig,Jg],Kg,[m,n]), only row-by-row format.

Obviously, this kind of function exists in compiled languages. For example, in C language, one can use the SuiteSparse from T. Davis and with Nvidia GPU, one can use Thrust library.

Optimized classical assembly algorithm (OptV1 version)

The idea is to create three global 1d-arrays \vecb{I}_g, \vecb{J}_g and \vecb{K}_g allowing the storage of the element matrices as well as the position of their elements in the global matrix. The length of each array is \ndfes \nme. (i.e. 9 \nme for d=2 and 16 \nme for d=3 ). Once these arrays are created, the matrix assembly is obtained with one of the previous commands.

To create these three arrays, we first define three local 1d-arrays \vecb{K}^e_k, \vecb{I}^e_k and \vecb{J}^e_k of \ndfes elements obtained from a generic element matrix \MAT{E}(T_k) of dimension \ndfe :

_images/OptV1_IgeJgeKge.png

From these arrays, it is then possible to build the three global arrays \vecb{I}_g, \vecb{J}_g and \vecb{K}_g, of size \ndfes\nme \times 1 defined, \forall k \in \ENS{1}{\nme}, \forall l \in\ENS{1}{\ndfes}, by

\begin{array}{lcl}
  \vecb{K}(\ndfes(k-1)+l)&=&\vecb{K}^e_k(l),\\
  \vecb{I}(\ndfes(k-1)+l)&=&\vecb{I}^e_k(l)=\mathcal{I}^k((l-1) \bmod m +1),\\
  \vecb{J}(\ndfes(k-1)+l)&=&\vecb{J}^e_k(l)=\mathcal{I}^k(E((l-1)/ m)+1).
\end{array}

So, for each triangle T_k, we have

_images/AssemblyMat_OptV1_generic.png

Then, a simple algorithm can build these three arrays using a loop over each triangle.

Note

We give the complete OptV1 algorithm

_images/AssemblyMat_OptV1_algo.png

New Optimized assembly algorithm (OptV2 version)

We present the optimized version 2 algorithm where no loop is used. We define three 2d-arrays that allow to store all the element matrices as well as their positions in the global matrix. We denote by \MAT{K}_g, \MAT{I}_g and \MAT{J}_g these \ndfes-by-\nme arrays, defined \forall k \in \{1,\hdots,\nme\}, \forall il \in\{1,\hdots,\ndfes\} by

(1)\MAT{K}_g(il,k)=\vecb{K}^e_k(il),\quad\quad \MAT{I}_g(il,k)=\vecb{I}^e_k(il),\quad\quad \MAT{J}_g(il,k)=\vecb{J}^e_k(il).

The three local arrays \vecb{K}^e_k, \vecb{I}^e_k and \vecb{J}^e_k are thus stored in the k^{th} column of the global arrays \MAT{K}_g, \MAT{I}_g and \MAT{J}_g respectively.

A natural way to build these three arrays consists in using a loop through the triangles T_k in which we insert the local arrays column-wise

_images/AssembleMatV2d.png

Figure 48: Construction of \mathbb{K}_g, \mathbb{I}_g and \mathbb{J}_g with loop

The natural construction of these three arrays is done column-wise. So, for each array there are \nme columns to compute, which depends on the mesh size. To vectorize, we must fill these arrays by row-wise operations and then for each array there are \ndfes rows to compute. We recall that \ndfe does not depend on the mesh size. These rows insertions are represented in Figure 49 . We can also remark that, \forall (\il,\jl)\in\ENS{1}{\ndfe}, with m=\ndfe,

\MAT{K}_g(m(\jl-1)+\il,k)=e^k_{\il,\jl},\ \
\MAT{I}_g(m(\jl-1)+\il,k)=\mathcal{I}^k_\il,\ \
\MAT{J}_g(m(\jl-1)+\il,k)=\mathcal{I}^k_\jl.

where, in scalar fields case, \mathcal{I}^k_\il=\me(\il,k), \forall \il\in\ENS{1}{d+1} and in vector fields case, \mathcal{I}^k_{d a - b}=d \me(a,k)-b, \forall a \in \ENS{1}{d+1}, \forall b \in \ENS{0}{d-1}.

_images/IgJgKg_OptV2.png

Figure 49: Row-wise operations on global arrays \mathbb{I}_g, \mathbb{J}_g and \mathbb{K}_g

As we can see in Figures 50 and 51, it is quite easy to vectorize \MAT{I}_g and \MAT{J}_g computations in scalar fields case by filling these arrays lines by lines :

_images/Build2D_IgJg_OptV2.png

Figure 50: Construction of \mathbb{I}_g and \mathbb{J}_g : OptV2 version for 2d scalar matrix

_images/Build3D_IgJg_OptV2.png

Figure 51: Construction of \mathbb{I}_g and \mathbb{J}_g : OptV2 version for 3d scalar matrix

Using vectorization tools, we can compute \MAT{I}_g and \MAT{J}_g in one line. The vectorized algorithms in 2d and 3d scalar fields are represented by Figure 52.

_images/BuildIgJg_scalar_algo.png

Figure 52: Vectorized algorithms for \mathbb{I}_g and \mathbb{J}_g computations

In vector fields case, we construct the tabular \MAT{T} such that \MAT{T}(\alpha,k)=\mathcal{I}^k_\il, \forall \il\in\ENS{1}{\ndfe}. Then we can vectorize \MAT{I}_g and \MAT{J}_g computations in vector fields case. We represent in Figure 53 these operations in 2d.

_images/IgJgTabularMECA.png

Figure 53: Construction of \mathbb{I}_g and \mathbb{J}_g : OptV2 version in 2d vector fields case

Now, it remains to vectorize the computation of \mathbb{K}_g array which contains all the element matrices associated to \MAT{D} or \MAT{H} : it should be done by row-wise vector operations.

We now describe a vectorized construction of \mathbb{K}_g array associated to a generic bilinear form a. For the mesh element T_k, the element matrix is then given by

\MAT{E}(T_k)=\begin{pmatrix}
   a(\varphi_{i_1^k},\varphi_{i_1^k}) & a(\varphi_{i_1^k},\varphi_{i_2^k}) & a(\varphi_{i_1^k},\varphi_{i_3^k})\\
   a(\varphi_{i_2^k},\varphi_{i_1^k}) & a(\varphi_{i_2^k},\varphi_{i_2^k}) & a(\varphi_{i_2^k},\varphi_{i_3^k})\\
   a(\varphi_{i_3^k},\varphi_{i_1^k}) & a(\varphi_{i_3^k},\varphi_{i_2^k}) & a(\varphi_{i_3^k},\varphi_{i_3^k})
\end{pmatrix}

_images/Build2D_Kg_generic_OptV2.png

Figure 54: Construction of \mathbb{K}_g for Mass matrix

_images/Build2D_Kg_Mass_OptV2.png

Figure 55: Construction of \mathbb{K}_g for Mass matrix