Researches


I have interest in a number of different areas. Here is the list in no particular order.

numerical linear algebra

I have been doing research and developing software in AMG, iterative algorithms and geometric multigrid. A parallel AMG code SAM (Scalable Algebraic Multigrid) has been developed.

dynamic load balancing

For many practical applications, such as adaptive mesh based calculations, the load changes during the course of parallel computation, making it necessary to balance the load dynamicly, and in parallel. Most of the existing parallel dynamic load balancing algorithms involve two steps: flow calculation and migration. The traditional algorithm for flow calculation is the diffusion algorithms. We have proved that this type of algorithms are "optimal" in terms of the amount of load migration. However we have proposed a new algorithm which is also "optimal", but is much more efficient.

static load balancing (grid partitioning)

the numerical solution of partial differential equations usually involves dividing up the physical domain into small elements or volumes to form a mesh. To solve the problem on a distributed memory parallel computer, the mesh should be decomposed into subdomains, the number of which usually equals the number of processors, with each subdomain assigned to a unique processor. The static load balancing problem is that of how to decompose the mesh into subdomains so that each processor has about the same amount of computation and so that the communication cost between processors is minimized, with the overall aim of minimizing the runtime of the calculation.
Figure: mesh around a car body partitioned into 16 subdomains.

sparse matrices ordering I

solving general nonsymmetric sparse systems in parallel efficiently is a challenging problem. One way of achieve that is to order the system into bordered block-diagonal form, and factorize the diagonal blocks in parallel. A multilevel algorithm MONET has been develop for this purpose. Other ordering algorithms are also being investigated, for example, for minimizing frontal size/bandwidth in the context of frontal/multifrontal solvers.
Figure: A sparse matrix ordered into BBD form with two diagonal blocks.

sparse matrices ordering II

The problem of reordering a matrix to minimize its front size has many applications. A good reordering helps to reduce the memory usage as well as floating point operations of a frontal solver, it also helps when forming incomplete a LU factorization as a preconditioner for Krylov subspace iterative algorithms. We have proposed a multilevel algorithm for reordering sparse symmetric matrices to reduce the wavefront and profile. The algorithm uses a maximal independent vertex set for coarsening and the Sloan algorithm on the coarsest graph. It is shown to produce orderings that are of a similar quality to those obtained using the best existing combinatorial algorithm (the hybrid Sloan algorithm), yet does not require any spectral information and is significantly faster, requiring on average half the CPU time. A paper entitled Multilevel algorithms for wavefront reduction is available for download.

scheduling of message passing

unblocked (asynchronous) communication is usually the most efficient way of message passing in unstructured applications. However when this is not possible due to reasons such as memory consideration, it is necessary to organize the blocked communication to minimize the communication cost. This turns out to be an interesting graph coloring problem.
Figure: mesh around the skirt of a rocket partitioned into 16 subdomains.

numerical optimization

I am interested in local optimization algorithms such as quasi-Newton or Conjugate Gradient algorithms. I am also interested in global optimization algorithms (Genetic Algorithms, a parallel controlled random search algorithm);

CFD solvers

research and development in solvers for CFD, particularly for finite volume/finite element based calculations. These include FAS nonlinear multigrid for SIMPLE based finite volume imcompressible calculations, block correction based parallel multigrid linear solver and AMG solvers for DNS.

Figure: agglomeration based multigrid -- mesh coarsening.

parallelising industrial finite element applications:

FLITE3D is a multigrid Euler code for external aero-dynamics. It is used extensively by British Aerospace in aircraft design and simulation. This code has been efficiently parallelised in the FLITE3D project. It has been used for production since and has been proved highly robust and efficient.

chemical process simulation (Parallel DAE solvers)

chemical processes are frequently simulated using a large set of differential algebraic equations (DAEs). Computer simulation of such chemical processes involves integration of DAEs over the time interval of interest. Implicit integration schemes are normally used for stability. The nonlinear equations resulted from the implicit integration scheme can be solved using Newton's method, with the sparse Jacobian systems solved using a (direct) sparse solver, such as the MA48 in the Harwell subroutine library. This can be extremely computing intensive. Increasingly, efforts are being made to utilise the vector and parallel computers. Effective use of parallel computers for the solution of DAEs presents a great challenge, because the initial value problems are intrinsically sequential, and the components of the solution process, such as the direct sparse solver, are not easily parallelised either. One way of introducing parallelism into the sparse solver is through reordering the matrices into special for for parallel factorization. To this end we have developed the MONET software . Alternative approaches have also been explored, including iterative algorithms with suitable preconditioners. On a more coarse grain level, parallel extrapolation method has been investigated. It has been demonstrated that some speedups can be achieved.

*Figure (top of the page): density distribution around a F18 as iterations evolve, calculated using FLITE3D on a Cray T3E/900.
Hosted by www.Geocities.ws

1