Glossary of terms used in Mozart

Copyright © this document: started June 11, 2003; Mozart: 1990-2003 by

Philippe Fullsack and Erik Demaine


Last update: June 11 2003


A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R | S | T | U | V | W | X | Y | Z

Back to Index

A-node -- (A-NODE)
Refers to a node whose points belong to an algebraic system of equations to be solved.

See also: Node



Back to Index

Boundary Conditions
Refers to

See also: PDE-type



Back to Index

Constraints

See also: Penalization



Comm -- (STRUCTURE CLASS for : Communication)

See also: Parallelism



Back to Index

Domain Decomposition
It seems that domain decomposition methods may be like trees: narrow and deep, or wide and shallow. Last year I proposed that wide and shallow could offer a road to increasing resolution for 3D runs.

The basis of such a method would be a dual-primal FETI method in which the unknowns are split into 3 groups of 2 variables, as done by Widlund and Li at NYU.

Each group is a 'velocity-pressure'-like set of unknowns corresponding respectively to: grid nodes internal to a non-overlapping decomposition of the domain, corner nodes of this decomposition (the coarse component of the method), and 'abstract nodes' given by the system of fluxes across edges or faces of the skeleton (and their dual normal force components).
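Schematically (my notation, not Li and Widlund's exact formulation): with interior unknowns $u_I$, corner unknowns $u_c$ and Lagrange multipliers $\lambda$ carrying the flux/continuity constraints across the skeleton, the splitting leads to a saddle-point system of the form

$$\begin{pmatrix} K_{II} & K_{Ic} & B_I^{T} \\ K_{cI} & K_{cc} & B_c^{T} \\ B_I & B_c & 0 \end{pmatrix} \begin{pmatrix} u_I \\ u_c \\ \lambda \end{pmatrix} = \begin{pmatrix} f_I \\ f_c \\ 0 \end{pmatrix},$$

where $K_{II}$ is block-diagonal across the subdomains (the parallel bulk), the corner block carries the coarse component, and $\lambda$ is the 'topological glue'.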

We may view the 3rd group in this method as a 'topological' gluing Stokes system. I would call the difference between the resolutions of the coarse and fine levels the depth of the method. My suggestion was to prepare a quasi-inverse (L) for the coarse method and correct one or 2 levels down using the dual-primal gradient iteration. The bulk of the work would be parallel, and preconditioning would have to deal mostly with convergence in the topological gluing.

This gives rise to a general 'nested set' of 'Stokes-like' problems. The method is very 'general/abstract' and interesting. It is unfortunately not presented by Li 'as it should' be...

See also: Uzawa



Back to Index

Efficiency
The question of efficiency arises at all levels of 'software technology'. How efficient can the process of software development be? How efficient is the execution of such and such a program designed to solve a particular problem?

Mozart has been written with a strong bias towards flexibility, at the expense of efficiency. We had a discussion with Bruce today on that matter.

A book, and probably a bookshelf, could be written on this topic.

We count on a 'multilevel approach' in which the same library (Mozart) is used at a high level to tackle a variety of problems or algorithms, possibly inefficiently or suboptimally, and then revisit the particular paths taken and write a finely tuned, optimized version of those sections.

Examples are too many to be reported here. Of course, the requirements that were put on Mozart (to handle around 30 local control fields in each cell, with arbitrary definitions of rheologies, etc., with the possibility for the user to establish a range of relations between fields, with the idea that several clouds of integration should be able to participate in the assembly of the energy, etc.), while maybe doing some good to the code, certainly do some evil as well.

Anybody is welcome to appreciate that this type of 'evil overloading' of the code is what typical code users do all the time, paying more (all) attention to whichever simple and quickly tested addition may be enough to solve the next problem, and less (no) attention to repairing major weaknesses or addressing efficiency.

Crudely, one could imagine rating a program on 2 scales: range and efficiency. One could call program maturity the ratio of the development time devoted to increasing efficiency to the time devoted to increasing range. Sopale, e.g., is very immature; Mozart somewhat more mature.

Now one would have to look at parameters such as age (a good estimate of program cost, salaries being a primary cost), maturity and range. Policies or taste would advocate e.g. young, mature, low-range programs, etc.

Additional difficulties exist if we wanted to use these criteria for a computation of costs and amortizing... Hardware and knowledge are not static. Young, mature, low-range objects will quickly adapt to novelties. However, not every novelty spans a large range, and sooner or later the range problem will reappear. High-range immature programs may quickly gain maturity. This is the development philosophy I have chosen for Mozart, at a modest but labor-intensive scale...

Which is no doubt wishful thinking...

See also: Flexibility



Back to Index

Graph -- (STRUCTURE CLASS for : Graphs)

See also: Structure



Back to Index

Inits

All those things you have to do to get the ball rolling. Used in Mozart e.g. as a subset of problem setup.

So we may speak about T-inits or VF-inits... for what you think. Users should largely tailor 'inits' to their needs.

See also: Structure



Back to Index

Idiosyncrasy

See also: Structure



Back to Index

Lsystem -- (STRUCTURE CLASS for : linear (or matrix) systems)

See also: Structure



Back to Index

Media -- (Integration / summation)
Refers to intrinsic properties of a system which take the form of a set of relations between various subsets of a system state.

Media is an extension of the concept of constitutive laws ('lois constitutives') used in continuum mechanics.
Particular media are: rheologies, thermologies, densitologies, fologies, where the prefix before '-logies' is evocative of which pair of variables is linked.

In the context of continuum mechanics or thermodynamics, relations occur primarily as:

  • models of 'body force densities' and
  • relations between pairs of variables which are the dual-primal variables of a scalar product given by the internal energy (or dissipation functional in the case of Onsager fluxes and entropy)

Mozart refers to X-Media, where the prefix X is a mnemonic for the type of problem under consideration (e.g. ADP-Media or T-Media for advection-diffusion-production systems, or VF-Media for viscous fluid systems).

See also: Imports , Polar fluids , Viscoplasticity



Memory allocation
In Mozart, dynamic allocation is performed through calls to the C routine malloc(). C offers 3 basic routines for dynamic memory allocation: malloc, calloc and realloc. The storage classes are AUTO and REGISTER (for local variables exclusively) and EXTERN and STATIC (for global variables; STATIC also gives a local variable permanent lifetime). The organization of a C program in memory is segmented into: a code segment (in executable form), a data segment consisting of the STATIC and EXTERN variables, then a stack of AUTO variables, then a heap of memory freely available to the program. Scope, lifetime and placement of variables in the memory system will of course affect the program's functionality and efficiency. Bruce, after my complaining to him that my routines were using lots of local variables, was telling me to hint to the compiler to place variables in, or close to, the CPU. Indeed: we would have to identify how many and which registers, which variables, etc. on the sp4. Worth doing for fine tuning of the most demanding routines only... Hopefully these will be the linear solver's, already optimized...

The code dynamic.c, written by Erik for Mozart, contains calls to malloc encapsulated as dyn_ialloc and dyn_ralloc, and a small library providing all basic operations for dynamic arrays, which not only may be allocated but may shrink or grow on request; this itself is an encapsulation of realloc.
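As a minimal sketch (only the names dyn_ialloc and dyn_ralloc come from the text; the bodies and the grow routine are my assumptions, not Mozart's actual code), such wrappers could look like:

    #include <stdlib.h>

    /* allocate n integer*4 / real*8 slots; sizeof is applied here,
       as noted above for the calls to malloc */
    int    *dyn_ialloc(size_t n) { return malloc(n * sizeof(int)); }
    double *dyn_ralloc(size_t n) { return malloc(n * sizeof(double)); }

    /* grow (or shrink) a previously allocated real array on request:
       the encapsulation of realloc mentioned above */
    double *dyn_rgrow(double *p, size_t n_new) {
        return realloc(p, n_new * sizeof(double));
    }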

In the f77 routines I call a function wdyn(n,0,0,iop), as in node_x(node_i)=wdyn(node_anodes(node_i),0,0,3). iop=2 allocates integer*4, iop=3 allocates real*8 (sizeof is used internally when calling malloc).

The function wdyn(n,struct_type,struct_n,iop) was written initially not only to call dyn_ralloc or dyn_ialloc but also to keep a record, structure by structure and structure_type by structure_type, of all memory allocated so far.
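A hypothetical C-side sketch of that dispatch and bookkeeping (MAX_STRUCT_TYPES, the tally array and the f77 binding with pointer arguments are my guesses, built on the dyn_* wrappers sketched above):

    #include <stdlib.h>

    #define MAX_STRUCT_TYPES 32                   /* assumed bound */
    static size_t bytes_alloc[MAX_STRUCT_TYPES];  /* per-type record */

    extern int    *dyn_ialloc(size_t n);
    extern double *dyn_ralloc(size_t n);

    /* f77 passes by reference, hence the pointer arguments;
       iop=2: integer*4 array, iop=3: real*8 array (struct_n unused here) */
    void *wdyn_(int *n, int *struct_type, int *struct_n, int *iop) {
        size_t unit = (*iop == 2) ? sizeof(int) : sizeof(double);
        bytes_alloc[*struct_type] += (size_t)(*n) * unit;
        return (*iop == 2) ? (void *)dyn_ialloc((size_t)*n)
                           : (void *)dyn_ralloc((size_t)*n);
    }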

Also initially, well, years ago, each structure type had its own include file (node.h, lsystem.h, etc.) so that lib.h = { node.h; lsystem.h; ...} and you would not include a node in a routine not needing any. I got lazy and stopped bothering about the (nice) arguments struct_type and struct_n, which you therefore may choose randomly, or correct consistently in your version of the code.

Notice that I have used 32-bit arithmetic, and that I would have to modify the include file to split integer*8 pointers into their own box (how annoying...). Well, so far each (major) structure was defined in terms of integers (32 bits) only. I did not have to assume a 'prototype', that is, no variable was assumed from the start to be or not to be a pointer. Bonny told me it was bad programming style. Maybe. I have written a little test program to test how to allocate memory dynamically on the sp4 in 64-bit mode, so the difficulty is 'just' the splitting of the variables defining structures into 2 sections: one for pointers, one for non-pointers.

There exist several reasons why it would be useful to do that (note that the only machine where I encountered the same 'problem' before is the DEC Alpha):

  • a clean distinction between variable types
  • 64-bit arithmetic should NOT be penalized (greatly?) in execution time; my understanding is that the registers are built for 64 bits, but the cache behaviour/memory contention might be different for a computation involving 64- versus 32-bit arithmetic
  • The maximum size available for data (size of the heap? there must also be a limit linked to virtual addressing) depends on the choice of arithmetic (-q32/-q64 flags). -bmaxdata allows one to define several data segments by blocks of 256 Mbytes within some limits (2 Gbytes for -q32, none for -q64)? The details need to be looked at more carefully as I write this from memory.
  • Accuracy is higher in 64 bits. Penalization is so powerful that it would be the second major argument (for me) for keeping 64 bits besides.
  • Halving memory may be useful for large jobs (some 2D's, all 3D's?). Notice here that 32-bit arithmetic would require eliminating hanging nodes algebraically, an operation more complex than the penalization I wrote for rondo.

I have most of the time used dynamic allocation in Mozart with the following bias: most structures needed are allocated at the start of the problem to solve, and they all have an ability for initial extra-allocation (the user's responsibility). Extra-allocation (or over-allocation, which is how I refer to it in Mozart) should allow a significant part of the problem (e.g. a long enough time slice) to execute without memory exhaustion or need for redefinition; occasionally memory would be reallocated (ideally: grown. I have used free; allocate; in cello for adaptive computations).

I have not been worried by variables' lifetimes (e.g. a local node is permanent but used only inside the innermost finite element routines); scope has typically been large (often sharing much more than necessary), which is very flexible but also may be a source of concerns (I told you about segmenting the include file, but several routines manipulate several structures simultaneously, and often a routine uses only part of a structure)...

See also: Imports



Mesh -- (STRUCTURE CLASS for : meshes, set-covers ..)
Refers to the structure class used to define and manipulate 'meshes'. More specifically, a Mesh is a set of entities called elements. They 'typically' correspond to subdomains in a geometric embedding.

Grids may be represented in Mozart as a (Node, Mesh) couple; but meshes may also describe set coverings which may correspond to no grid.

For example a Mesh may be used to represent a tiling of Voronoi cells (dual to the Delaunay triangulation of a Node) or a set of 'patches' with arbitrary overlap.

Such covers are useful both in the so-called meshless methods and in bubble meshing algorithms.

Mozart provides several functions for mesh-ing a node into a grid. Several meshers have been developed specifically for Mozart (Quadtree, Octree, Halfner, TensorProduct), and Jonathan Shewchuk's exquisite TRIANGLE mesher has been interfaced to Mozart.

See also: Imports , Polar fluids , Viscoplasticity



Back to Index

Node -- (STRUCTURE CLASS for : clouds, abstract arrays)

See also: A-node, I-node



Back to Index

Parallelism

No one will be surprised to read here that there is more than one way to skin a (parallel) cat. I never got any training in 'parallel programming', so when I first learned about parallel machines and MPI I thought message passing was so simple, easy to use and powerful that it was 'THE' method for writing parallel programs.

I had also heard, maybe from Erik, that MPI could be the method of choice on shared memory machines as well. We were exposed to hardware that was on the distributed, not the shared, side of the divide with the ALEX and later the SP2 (vector machines such as the NEC-SX or Cray were always crowded... and not 'ours').

This may explain why Mozart is strongly biased to the MPI side. I have always tried to maximize the program's portability. Erik installed mpich and off we were, in parallel on virtually any network. Mozart has hooks to several powerful and free libraries (PSPASES, Triangle, mpich, X11/XFree86, Metis... Ming for the more exotic) and I have always tried to provide functionality regardless of the platform: no need for f90 or OpenGL or WSMP...

So here is what happens today: Mozart is from the stone age of parallelism. I have written constructs which, I am sure, exist as simple one-line directives in OpenMP. Look at the LLNL tutorials for the very simple parallel loop opening and private and shared variables.

In this sense Mozart is indeed doing things that compilers (should) do, although compilers have a considerably wider and more complex scope.

I realize of course that several things that Mozart does are more complicated than necessary, if you have OMP or Pthreads.

We started on the sp2 with an exclusively distributed machine, got more shared parallelism (4 processors) on the sp3, and 8 today on the sp4, so we definitely should take advantage of this. So far I have always assumed the worst, but maybe we should relax...

I give the following example. Let us say we relax and say: it's OK, even good, to use WSMP. Well then why worry, say in 2D, about parallelism? The multithreaded version of WSMP is completely transparent and you have literally nothing to do. In essence the routine used to access blkfct can be used to access WSMP: just pass the CRS graph of the matrix to compute. Yes, right, but how about other things, such as parallelizing the Lagrangian tracking?

Well, nothing could be simpler: cut your search loop (cyclically: by the mapping k=mod(i,pp) where pp is the number of threads), create pp threads, then join them; or, even simpler, use an OMP loop. In short: use one thread, say on one of the 4 dual processors, to track the first 1000 particles, another to track the next pack of 1000, etc. You will have noticed the critical assumption that the velocity is a shared array available to any thread. All memory is shared on an SMP! Contrast this with what I have done in FAN, the Mozart algorithm used for tracking grids regardless of memory locality.
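A minimal sketch of that OMP loop in C (track_one, np, xp and vel are placeholder names, not Mozart's; the packs of 1000 appear as the chunk size):

    #include <omp.h>

    extern void track_one(int i, double *xp, const double *vel);

    void track_all(int np, double *xp, const double *vel) {
        int i;
        /* vel is the shared velocity array: on an SMP every thread reads it */
        #pragma omp parallel for schedule(static, 1000)
        for (i = 0; i < np; i++)
            track_one(i, xp, vel);   /* advance particle i */
    }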

By the way, -qsmp=omp gives access to OMP, the _r suffix on the compilers gives access to Pthreads, and the mp prefix gives access to MPI. Not to mention the choice between f90 and f77, e.g.

So all the tests I have been doing, and the programming style, are general. They would work on e.g. an array of p690's, or anything else (as long as you have MPI). Actually, as long as you can emulate the functions of MPI: send, receive, whoami, howmany. The COMM structure is meant to provide a simple abstract data type (structure) for communication.

It just would take time to absorb into COMM more of the attributes which typically define communicators, so COMM is not very advanced AS A STRUCTURE in Mozart. That is, we use mostly (quasi-exclusively) the number and name of processors: comm_pp(comm_i), comm_myid(comm_i).
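In plain MPI those two attributes correspond to the following (a sketch; the Mozart routine that fills comm_pp and comm_myid is not reproduced here):

    #include <mpi.h>

    /* howmany and whoami: the two COMM attributes used quasi-exclusively */
    void comm_basics(int *pp, int *myid) {
        MPI_Comm_size(MPI_COMM_WORLD, pp);    /* comm_pp(comm_i)   */
        MPI_Comm_rank(MPI_COMM_WORLD, myid);  /* comm_myid(comm_i) */
    }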

Of course, on my wish list, people would come and enrich those. For example: encoding in COMM both the specifics of mpi.h (MPI) and pthread.h on the sp4 would probably be an asset in 'ripening the code'. I just have no time to do that.

If COMM is not rich, the moz.comm.lib routines are useful in providing encapsulation of 'high level' communication 'moves'. To give an example, with these routines you may communicate not an array but a structure, a NODE for example. Or, in one call, get all your nodal field ranges reduced globally, a useful operation before e.g. a contour plot. I thought of using the instantiation variable comm_i as a pointer to a pool of 'nodes' in a network, for example. So if I had time I would extend moz.comm.lib.f to use, say, 3 comms: the first would represent communication across p690 machines (linked into a cluster) with the communication protocol used there (MPI e.g.); the second (MPI, using the US switch) representing the network of the 4 (MCM) SMP's of the sp4; the last level representing a set of concurrent threads within each SMP. We could add more levels: level 3.a, one thread per dual processor; level 3.b, a set of threads in each dual processor, e.g. for ILP. Internal communication could be suppressed (memory is shared) and MPI would link several values of comm_i (that would be different
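As an example of such a 'move' (plain MPI shown; field_range is a hypothetical name, not a moz.comm.lib routine), the global reduction of a nodal field's range could look like:

    #include <mpi.h>

    /* reduce the local [min,max] of a nodal field f(0..n-1) to a global
       range, e.g. before a contour plot */
    void field_range(const double *f, int n, double *gmin, double *gmax) {
        double lmin = f[0], lmax = f[0];
        for (int i = 1; i < n; i++) {
            if (f[i] < lmin) lmin = f[i];
            if (f[i] > lmax) lmax = f[i];
        }
        MPI_Allreduce(&lmin, gmin, 1, MPI_DOUBLE, MPI_MIN, MPI_COMM_WORLD);
        MPI_Allreduce(&lmax, gmax, 1, MPI_DOUBLE, MPI_MAX, MPI_COMM_WORLD);
    }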

Maybe we could realize with COMM things that bear some similarities to Java's remote method invocation and remote procedure calls, for example.

Explicitly using MPI allows you to always be in MIMD mode, but this is not suitable for every application. SIMD looks more like what we want for e.g. Lagrangian tracking. Also, a bias has been assumed towards the side of coarse granularity.

To summarize: we are driving a racing car like a bike, but it is not too difficult to push the accelerator pedal. Stop running your code in 'parallel'! Decent things can be done on each of the 4 SMPs. I am all in favor of writing a layer of OMP/Pthreads. Experiment with those environment variables (threads' spin-wait and yield time). It is likely that phases of the computation with high intensity could benefit from the view that there should be more threads than processors, because these phases would not be memory-bandwidth limited (?).

Fit the largest grid on one SMP compatible with the memory requirements and availability, profile it, and react with optimization if the linear solver is not consuming most of your resources.

Coming back to the question of parallelism: in the end, optimal efficiency will be achieved by programs which are written so as to use the hardware resources well. If a general enough representation of these, in terms of their components and relations, is available in the program through some structures, it would seem we could get a finely tunable program. Multithreading is a generic term and hardware implementations can be very different: e.g. in the resources available per thread, the relation between threads and the clock, the number of instructions available to threads (simultaneously, for ILP). That is how you get all these acronyms: CMP, CMT, FMT, SMT... (chip multiprocessing, coarse multithreading, fine multithreading, simultaneous multithreading).

See also: A-node, I-node

PDE-type
Attempting a classification of partial differential equations is a complex story. Broadly speaking, once you have converted the (system of) PDEs to an explicit form (if the system was initially presented as a set of functional equations in the jets), you are left with the biggest distinction: is it linear or not?

If it is linear, and if the space under consideration is a homogeneous space, that is, the quotient G/H of 2 Lie groups, there is a theory like Fourier's theory to solve those beasts. This is where characters and representations and Helgason and Kirillov etc. come in...

By using the fact that the Fourier transform maps convolution to products, you are left with a special algebraic object called the symbol. This (for PDEs on fibre bundles over Riemannian spaces) is a matrix section on the cotangent bundle. OK, fine, no more big words.

Classifying these (symbol) matrices classifies the PDEs. E.g., do we have nontrivial propagation (that is, characteristics, or real eigenvectors) or not (the hyperbolic/elliptic distinction)?
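A worked one-line example (standard material, not specific to Mozart): for a scalar second-order operator

$$L u = \sum_{i,j} a_{ij}\,\partial_i \partial_j u, \qquad \sigma_L(\xi) = -\sum_{i,j} a_{ij}\,\xi_i \xi_j,$$

the operator is elliptic when $\sigma_L(\xi) \neq 0$ for all $\xi \neq 0$ (the Laplacian gives $\sigma = -|\xi|^2$), while the wave operator $\partial_t^2 - \Delta$ has real characteristic directions, $\tau^2 = |\xi|^2$: the hyperbolic case.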

In the applications tackled in Mozart, some fields obey hyperbolic-type and others elliptic-type PDEs...

We should not think that the (mathematical) game is over. The elliptic case seems to be the best understood, but many questions remain open on the complement...

See also: Scalar products



Plot -- (STRUCTURE CLASS for : plots, drawings, X-windows)

See also: A-node, I-node



Back to Index

Quadrature -- (Integration / summation)
Refers to the computation of integrals with respect to some measure (e.g. the Lebesgue measure, measures that are absolutely continuous with respect to the latter, Stieltjes measures, etc.).

Quadrature offers a way to replace infinite systems by discrete ones. For particular systems (e.g. those considered in mechanics), integrals are invariants conserved by the (Lie group of symmetries of the) flow.

Weak formulations of partial differential equations (PDEs) lead to (matrices of) integrals, and solutions of linear PDEs may be expressed as a linear convolution filter (kernel or propagator) acting on the 'essential data' (as in boundary element methods, for example).

The integration business, well, computing numerical values of integrals, has a long history of tremendous ingenuity (or genius, as in in-genius-ity).

It started French (I am biased) with some very powerful ideas of our idol: Pascal. French, then went British with Newton-Cotes, and then, oh well, German with an ultimate idea by the King (Gauss), which maybe has only been challenged by Markov...

Approximate (or exact) integration methods find an estimate of the integral as a weighted sum of function samples. In other words, an integral is just and always the weight of an object, and that is what Pascal wanted to compute...

So the fundamental object in integration is a triplet (w,x,f). If we don't want a rule for each different f, it means we probably want a rule of the type (w,x) = argmin(error, f in a certain class).
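In formulas (standard notation, mine): a rule is

$$\int f \, d\mu \;\approx\; \sum_{i=1}^{n} w_i\, f(x_i), \qquad E(f) = \int f\, d\mu - \sum_{i=1}^{n} w_i\, f(x_i),$$

and the design problem for a class $C$ reads $(w,x) = \operatorname{argmin}_{(w,x)} \sup_{f \in C} |E(f)|$.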

Fixing the sampling points occurred to Newton-Cotes-Simpson (regular clouds), then Lagrange did the same for irregular clouds. The rules were progressive, but their spectral condition number bifurcates (above n=11 for NC rules) and the l1 norm of w tends to infinity: bad conditioning, oscillations, negative weights.

Note that Lagrange's road (yes, French) leads to a system with a Vandermonde matrix. And this is exactly what is done in MLS (moving least squares) methods or in Ellipsis (the code developed by the smart fellow Louis Moresi at Monash).
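Concretely (a standard derivation, not Ellipsis' actual code): fixing the points $x_1,\dots,x_n$ and matching the moments of the monomials gives the linear system

$$\sum_{i=1}^{n} w_i\, x_i^{k} = \int x^{k}\, d\mu, \qquad k = 0,\dots,n-1,$$

i.e. $V^{T} w = m$ with $V_{ik} = x_i^{k}$ the Vandermonde matrix, whose notorious ill-conditioning is exactly the problem mentioned next.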

So, ahem, there is a conditioning problem. Practical minds ignore such difficulties.

And by the way, what did Gauss do? OK: he didn't fix x. He applied a null rule. That is, he took (w,x) = Z(C): the zero set of the error on some class C.

He chose C among polynomials. Then people like Hermite, Laguerre, etc. realized this all had to do with the Gram-Schmidt orthogonalization process in a Hilbert space with some 'weight'. It got linked this way with orthogonal polynomials. Well, Hilbert (the second King) had not yet shown the simplicity of this story... yet...
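In the standard formulation: the $n$-point Gauss rule takes the $x_i$ as the zeros of the degree-$n$ orthogonal polynomial for the weight, and then

$$E(p) = 0 \qquad \text{for all } p \in \mathbb{P}_{2n-1},$$

the largest polynomial class a rule with $2n$ free parameters can annihilate; the weights come out positive, so the conditioning problem above disappears.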

Game over? Not quite. Kronrod addressed the 'non-progressive' character of Gauss rules and proposed a (G,K) pair which may be used for error estimation (while I initially thought of the difference G(k+n)-G(k)).

Actually I probably skipped the most important episode after Gauss: the Stieltjes (Jan, not his dad) episode.

Game over? No. Golub and Meurant have been working quite a bit along the lines of adaptive integration. And I think they are right. In fact this has probably been underrated in the 'variational' literature, and this is why I take the trouble of writing this long (more than 5 lines long) story.

Dirk P. Laurie worked on anti-Gauss rules (who really invented those?). This is quite interesting because it is a non-null rule! Yes, it reminds me of acoustics, or active noise control: work out the rule that negates the Gaussian error on a class slightly bigger than the one it is designed for. Mmmm.

And so we invite you to think about the 'pertinence' of these considerations in adaptive CFD computations...

See also: Noether



Back to Index

Structure

Mozart manipulates objects by calling functions on a set of pointers to structures.

In Mozart, objects are very simple and consist of a set of arrays, lists or matrices. Each object belongs to a certain structure class (NODE, GRAPH, MESH, LSYSTEM, COMM, PLOT...) and is identified by an integer (its 'name' or 'number').

Each structure class has 'instantiation variables', such as node_i or mesh_j for the structure classes NODE and MESH respectively.

A function computing, say, the graph of a sparse matrix from a mesh could e.g. be written as: call mesh2graph(mesh_i,graph_i).

The style in Mozart often oscillates between the tendency toward abstraction and the need to solve particular problems. Hence some versions of Mozart took the trouble of defining a language, registering variables and their relations, while in the version which users in DGG will use this summer this does not exist, but lots of effort has been directed at systematic file-based I/O.

At the beginning of Mozart the development went down 2 very unequal roads: Erik was writing sophisticated C code with an in-depth knowledge of anything related to computers, but a poor knowledge of other things such as physics and, relatively speaking, mathematics. I had a poor knowledge of programming: basic f77 was all I knew. In particular I had learned a lot about mathematical structures, but nothing about data structures.

In retrospect it would have been more powerful to write Mozart fully in C. Indeed, I realize that in Mozart there is a manual mimicking of an element of the C language, struct, used to define structured data types. The difference is that struct is handled by the compiler and allows operations on structures that are more powerful, e.g. nesting of structures and chains of indirection.

This is not to say that Mozart structures are not structures, but they are not C structures. They are somewhat analogous: I have used naming like node_x, in retrospect equivalent to node.x in C (or deeper: node_geometry_type for node.geometry.type). So today maybe I would write:

    struct ngeometry { int sdim; int tdim; struct geommap *map; };
    struct nnonnamed_fields { int howmany; int length; char *names[MAX]; };
    struct node {
        struct ngeometry        geometry;
        struct ntopology        topology;
        struct nfields          fields;
        struct nnonnamed_fields nonnamed_fields;
        struct ndoc             doc;
    };

then declare struct node *pnode; and use (*pnode).geometry or pnode->geometry, or deeper, pnode->nonnamed_fields.howmany or pnode->nonnamed_fields.names[3], to access things.

I do not know how to easily write a chain of indirections in f77, apart from removing indirections incrementally, one at a time, by nested calls to %val.

On top of this, the C preprocessor offers convenient tools for defining macros, setting/getting environment variables, aliasing, conditional compilation, etc. This makes some programs look like a giant aliasing game. Look at TOMPI for example. Its power may be 'abused': some would consider this programming style obscure. Try to understand, by looking at TOMPI, how MPI initialization is mimicked by threads (several platforms: Solaris, AIX, C-threads, etc. being aliased at preprocessing time to unique names) and you will get the feeling that Mozart is childish and clear in comparison.

Anyway, C is beautiful (much more could substantiate this: bit and text manipulation, portability (gcc), etc.).

We only came to realize lately, probably due to our insufficient knowledge of computer science, that the development of efficient tools would need a reserve of basic data structures such as heaps, trees and graphs. We have been too busy to consider hooking Mozart to libraries such as LEDA to provide such tools.

If we look at what is present in Mozart, it will be realized that many of these basic devices are in fact used. CRS graphs, linked lists, maximal independent sets, partitioners, mesh refiners and hierarchical meshers offer practical solutions to several problems in Mozart.

The solutions proposed also typically have their bottlenecks. Mozart's original octree mesher is fast but requires memory, the interface to Triangle is powerful but goes through disk, and the MPI communication routines should emulate something like TOMPI on shared memory architectures...

At the other extreme from our simple, inefficient structures, you could consider very sophisticated pieces of algorithmic artillery: data structures that lead to provably optimal solutions to a given problem. The book written by Bernard Chazelle on the discrepancy method offers a wealth of examples (look up the soft heap for the minimum spanning tree problem) which are mathematical endpoints and *most likely* programmers' nightmares...

See also: Data structures



Back to Index

Z

The sign left by Zorro as a reminder of his power and sense of justice. Or, if repeated, the sound made by a snoring sleeper.

See also: Data structures