.. _matlab4: .. highlight:: matlab ********************************* MATLAB 4 ********************************* MATLAB is an interpreted language, in the sense that you can type a command and it executes immediately. This is in contrast to a compiled language, which converts your entire code into a format suitable for execution by your computer's processer and then executes it. While MATLAB has what is called a "Just In Time" compiler (JIT) known as the "accelerate" feature, the fact that MATLAB is interpreted sometimes slows it down. However, this has become less true in recent years, as the JIT compiler speeds up a lot of loops that MATLAB used to take a long time to execute. If you want to see the difference that the JIT compiler makes, type ``feature accel off`` into the command line (``feature accel on`` turns it back on). One way that MATLAB code can be sped up is by doing what is referred to as "vectorization," which essentially means that instead of using loops, we attempt to write computations as matrix or array operations on an entire vector/matrix that have been optimized for speed by the programmers that develop MATLAB. Figuring out ways to vectorize code is often more art than science; you either need to learn some of the non-obvious tricks that people have figured out over the years, or come up with them yourself. This lab looks at some ways that you can optimize MATLAB code. My strategy for optimizing MATLAB code is really quite lazy. If there is a really obvious way to vectorize something, I do so from the beginning. If there isn't something obvious, I'll just write it as a loop and run it to see how long it takes. If it runs slowly, then I will think for a little bit about vectorizing it (like 15 minutes maximum). If I can't find an obvious way to vectorize it, I will just write a Fortran program to do it. Even vectorized MATLAB code usually runs slower than unoptimized C or Fortran code, and I can usually write looped Fortran code faster than I can think about how to vectorize something since it is a direct translation of my existing code. However, if you are not a good C or Fortran programmer, then you might find that it is handy to know more vectorization tricks. I have included a document that describes many techniques for vectorizing MATLAB code ("mtt.pdf", also referred to by the MATLAB 2 lab) with the materials for today. It is worth looking through, and is a good source for tricks to optimize many common MATLAB operations. There are also many other places where you can find info on MATLAB vectorization on the web that you can find using your favorite search engine. This lab may take you longer than the class period to complete, so feel free to spend more time on it during the next class (the next class covers more miscellaneous topics that are not as essential to using MATLAB as vectorization). The second homework will also focus on this skill, so this is definitely something to spend some time on as it will certainly help you write efficient MATLAB code. =================== Defining Arrays =================== There are several tricks for defining arrays quickly. Besides the method of evaluating an entire vector (i.e. ``x = 0:0.01:1; y = sin(x);``), plus ``linspace``, ``logspace``, and other pre-defined matrices like ``eye`` (identity matrix), and the ``reshape`` command to transform one matrix shape into another. Additionally, there are a couple of other tricks: * ``repmat`` ("repeat matrix") -- if you have an existing matrix that is repeated multiple times, then ``repmat(A,size)`` concatenates multiple copies of ``A``. Try it out and see how it works (for instance, try giving it a scalar, and then ``[m n]`` for its size argument). * "Tony's trick" (named for the person who figured out how to do it and then spread the word on MATLAB message boards). This trick makes a constant matrix by taking advantage of the fact that you can evaluate a matrix with a logical matrix, and that MATLAB treats even a scalar value as a matrix: ``A = b(ones(m,n))`` creates an m x n matrix with the value of ``b`` everywhere. Basically, MATLAB is evaluating the scalar value at every location in the matrix where it finds a 1, so it assigns 2 to every matrix element. Note that this gives the same result as ``2*ones(m,n)``, but is faster because it avoids repeatedly doing multiplication. * ``diag(A)`` makes a square matrix with the entries of a vector ``A`` on the diagonal. There are additional functions to generate other kinds of sparse matrices that are useful for vectorized matrix generation. ==================== Shifting Arrays ==================== We may want to shift array positions to calculate something. One can shift the elements of a vector by one place using ``A([end 1:end-1])``, or for :math:`{k}` places by using ``A([end-k+1:end 1:end-k])``. This can also be done for a matrix along the first axis using ``A([size(A,1) 1:size(A,1)-1],:)``. You can also do this using the ``circshift`` function, though this trick is faster. This trick will be useful on the homework assignment, both in vector and matrix form. ==================== Array Math ==================== You can do array math using compact notation in MATLAB. Matrices and vectors can be added and subtracted and multiplied by scalars. We have already mentioned the element-wise operators like ``.*``, ``./`` and ``.^``. There are additional tricks, but they are rather specialized so if you need to do some complex array math like multiplying certain slices of a matrix by different numbers, there may be a nice way to do it in a single line. See the file "mtt.pdf" for some examples. It is also helpful to recall that MATLAB was originally designed as a linear algebra package, and consequently it is very fast when doing matrix math. Thus, if there is a way to write a computation as a linear algebra problem (i.e. matrix multiplication, solving a linear system, etc.), then that will most likely give you the most efficient execution. As an example, we will re-cast the sum of a Fourier series into a matrix multiplication problem to illustrate one way of "vectorizing" MATLAB computations. ================= Timing Code ================= MATLAB has a convenient method for timing code. To time a section of code, place ``tic;`` before the code, and ``toc;`` after the code. After running, you will see the elapsed time. However, note that if you include other things like plots and graphics inside of the ``tic toc``, the numbers may not be meaningful because producing graphs can be much slower than the numerical computation. Be sure that you are aware of what you are timing when comparing code! =================== Practice Problem =================== We will look at an example of vectorizing MATLAB code for summing a Fourier series to obtain a "seismogram." This example is taken from Stein and Wysession, *Introduction to Seismology and Earth Structure*. The problem looks at pulses traveling on a string that are reflected at the boundaries, illustrated below. A string of length :math:`{L=1}` is fixed at both ends, and waves travel with a speed :math:`{c=1}`. Because of the fixed ends, waves that reach the boundaries are reflected with the opposite sign. We are interested in determining the signal measured at :math:`{x=0.7}` due to a source at :math:`{x_s=0.2}` (though your code should allow for an arbitrary specification of :math:`{x}` and :math:`{x_s}`). .. figure:: stringwavediagram.* :scale: 100% Schematic for seismogram creation through a sum of normal modes. (a) A pulse is sent from the source point :math:`{x_s=0.2}`, and we would like to calculate the waves detected at a receiver at :math:`{x=0.7}`. The string has a length :math:`{L=1}`, and the signals travel at a speed :math:`{c=1}`. The ends of the string are fixed, so that a pulse arriving at the end is reflected with the opposite sign. (b) Direct arrival of the wave propagating directly from :math:`{x_s}` to :math:`{x}`. The signal should arrive at :math:`{t=0.5}`. (c) First reflected arrival, which reflects off the left boundary, and arrives at :math:`{t=0.9}`. (d) Second reflected arrive, where the direct arrival continues past :math:`{x}`, reflects at the right boundary, and arrives at :math:`{t=1.1}`. This problem can be solved through a normal mode summation. Each mode is given a weight that depends on the source position :math:`{x_s}`, receiver position :math:`{x}`, and source duration :math:`{\tau}`, and the wave amplitude is a function of time :math:`{u(t)}`: .. math:: u(t) = \sum_{n=1}^\infty\,\sin\left(\frac{n\pi x}{L}\right)\sin\left(\frac{n\pi x_s}{L}\right)\exp\left(-\frac{\left(\omega_n \tau\right)^2}{4}\right)\cos\left(\omega_nt\right) The frequencies are :math:`{\omega_n=n\omega_0=n\pi c/L}`. In practice, we will take the sum over a finite number of modes over a finite number of time steps (the two should be ideally be linked, which you will learn about in Signal Processing, though here we will allow for the number of modes and time resolution to be specified independently). 1. First, write a MATLAB function that loops over all modes and all time points to calculate :math:`{u(t)}`. Run the program using a total time of :math:`{T=1.4}`, :math:`{M = 200}` time points, :math:`{N=200}` modes, and :math:`{\tau=0.02}`. Time the execution using ``tic`` and ``toc`` as mentioned at the beginning of the lab. Note that it is best coding practice to do this as a function, which will make this code re-usable for different choices of the parameters. Think about what parameters you will need to pass to this function to make it as general as possible. 2. Now we will try to figure out how to "vectorize" this sum by writing it as a matrix multiplication problem. First, it is helpful to notice that when we compute the sum, the first three factors in the sum do not change as we change time points -- they are only dependent on :math:`{x}`, :math:`{x_s}`, and :math:`{\tau}`, which are constants. Therefore, we can imagine collecting this term into a "time-independent" vector of length :math:`{N}`. Define a vector that represents the :math:`{x}`, :math:`{x_s}`, and :math:`{\tau}` factors in a MATLAB function (if you want to compare the time for executing each of these, you will want to write a separate function), and set it to the appropriate values by using array evaluation (i.e. without looping over :math:`{n}`). 3. The remaining issue is to break down the time dependence. Note that for a single time point :math:`{t_0}`, we are summing :math:`{\cos(\omega_n t_0)}` weighted by the time-independent vector over :math:`{N}` modes. This can be written as the product of a length :math:`{N}` row vector representing the values of :math:`{\cos(\omega_nt_0)}` and the length :math:`{N}` column vector for the time-independent factors discussed above. Write another MATLAB function, eliminating the loop over the different normal modes, so that you are doing a loop over all time points and doing a matrix product during each execution of the loop. Note: you will need to redefine your row vector representing :math:`{\cos(\omega_nt)}` every time step to accommodate the changing value of :math:`{t}`. 4. Now that we have vectorized the normal mode sum, it should be clear that instead of redefining :math:`{\cos(\omega_nt)}` at every time step, we can make this into an :math:`{M\times N}` "time-dependent" matrix, where :math:`{M}` varies over all :math:`{t}` and :math:`{N}` varies over all :math:`{n}`. This will reduce the summation problem into a single instance of matrix multiplication, which MATLAB can carry out very quickly. 5. The one remaining issue: how do we define the :math:`{M\times N}` time-dependent matrix without looping over the whole matrix? Putting our linear algebra thinking caps on again, we realize that an :math:`{M\times N}` matrix is really just the product of a length :math:`{M}` column vector and a length :math:`{N}` row vector -- the :math:`{\omega_nt}` factor can be written as (:math:`{M}` time points as a column vector) :math:`{\times}` (:math:`{N}` modes as a row vector). MATLAB can then easily take :math:`{\cos()}` of a matrix. 6. Finally, implement a function to make a vectorized sum to calculate :math:`{u(t)}` for the same parameters mentioned above. Time the execution, and compare with the looped computation. Your time series should look something like the plot below. .. figure:: stringwave.* :scale: 80% Response calculated for a source at :math:`{x_s=0.2}` of duration :math:`{\tau=0.02}` at point :math:`{x=0.7}` for :math:`{N=200}` modes. The arrival times correspond to the expected values based on the diagram above. One important thing to note here: what we did here was to optimize how MATLAB computes this sum, but all that we did was eliminate the loops to give us code that does the same computations in a shorter time period. We still added up the contributions of :math:`{N}` modes for :math:`{M}` time points in the end, so we didn't save the computer any computations. While you should do this sort of optimization as much as possible, a better (and more clever) way to optimize this problem is to recognize that we are simply doing a Fourier transform of the source function, doing a simple multiplication, and then doing an inverse Fourier transform. Why is this important? Because some smart people in the 60's figured out that you can compute a Fourier transform on the computer faster than our matrix multiplication (though mathematics titan K. F. Gauss apparently knew about this algorithm in the early 1800's). Thus, one can do this calculation faster than our method here using Discrete Fourier Transforms. This is to serve as a reminder that while you can make your MATLAB code more efficient by vectorizing certain computations, you stand to gain even more by coming up with a way to do less computation in the first place.