MATLAB 1
MATLAB is short for MATrix LABoratory, and is a common high-level tool for numerical analysis in science and engineering. A few things to know about MATLAB:
- Fundamental data type is matrix (double precision floating point numbers)
- Interactive and “interpreted” (there is a “compiler” that makes code run faster), runs .m files and stores data using .mat files
- Many, many built-in functions (we will only scratch the surface in this class)
- Built in 2D and 3D graphics capabilities
- Additional toolboxes add further capabilities for extending MATLAB beyond its basic functionality
MATLAB was originally a package for matrix math, designed to provide an interface to linear algebra libraries written in Fortran, but has grown over the years into a software package useful for a range of scientific applications. It is commonly used in academic, research, and industrial settings (which is why geophysicists should know how to use it). MATLAB currently uses a proprietary version of the LAPACK linear algebra routines.
MATLAB is commercial software (we have a license for it on all of the computers in the Mac Lab). For those that have strong feelings about proprietary software, there are a number of free open source alternatives: GNU Octave, FreeMat, SciLab, R, a combination of Python packages, etc. Some of these are meant to be MATLAB clones – GNU Octave and FreeMat are supposed to run MATLAB code with the same syntax, while the others have a distinct syntax. I have used GNU Octave extensively in my research in addition to MATLAB, as it is fairly easy to install under various Linux distributions. I have found it mostly compatible with existing MATLAB code, though I often have to make a few modifications and sometimes must replace functions that do not exist in Octave (and some Octave code is not compatible with MATLAB). It also does not run as fast. I use Python along with a number of additional packages in my research in place of MATLAB, though that has been born out of the fact that I simply like programming in Python better (rather than the fact that MATLAB is commercial).
Regardless of all that, MATLAB is a fixture in science and engineering, and you will need to be familiar with it in any career path you choose. You will no doubt encounter software for geophysical research that has been written in MATLAB. In this course, we will be doing our lab exercises in the Mac Lab using MATLAB. The computers have been updated to the most recent release, though most of what we cover should work on other versions of MATLAB. You are also free to use one of the open source alternatives, as long as you use one that has the same syntax as MATLAB. If you decide to use something other than the version of MATLAB in the Mac Lab, all I ask is that you be sure that MATLAB in the Lab can run all of your code, as I like to run your code myself. If I have to substantially tweak your code to get it to work, I will take points off.
Everything is a Matrix
The main philosophy in MATLAB is that everything is a matrix (specifically, a matrix of floating point numbers with double precision). This is very convenient for doing numerical analysis of data, because as we have seen with Python, we very often need to represent arrays of data in scientific settings. Even if you just want to consider a single floating point number in MATLAB, it is treated as a 1x1 matrix.
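As a quick check of this idea, the short sketch below asks MATLAB for the size and class of a single number. The variable names are only for illustration.

```matlab
x = 3.7;          % a single number typed at the prompt
sz = size(x);     % number of rows and columns: [1 1]
cls = class(x);   % the data type MATLAB chose: 'double'
disp(sz);
disp(cls);
```

The size of 1 by 1 is what is meant by saying that even a single number is treated as a matrix.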
MATLAB stores matrices using the 1-indexing convention (i.e. if a matrix A contains 3 elements, then A(1) is the first element and A(3) is the last element), and does so using a convention known as “column major order.” This refers to how the data is arranged in the computer’s memory: if we want to create a 2x2 matrix, MATLAB sets aside enough memory to store 4 floating point numbers. However, there are two different ways we could store the 4 matrix entries in memory: we can either store the first row, then the second row, or we can store the first column, then the second column. MATLAB uses the second convention. This is important to know when you read data from a file into MATLAB, because numbers will be read into consecutive locations in memory from the file. We will look at some examples of how to do this in the coming labs.
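One way to see column major order in action is to flatten a small matrix and look at the order the entries come out in. This is only a sketch with made-up values; the names flat, second, and B are illustrative.

```matlab
A = [1 2; 3 4];           % first row is [1 2], second row is [3 4]
flat = A(:)';             % entries in memory order, as a row vector: 1 3 2 4
second = A(2);            % a single index follows the same order, so this is 3
B = reshape(1:6, 2, 3);   % reshape fills down each column: [1 3 5; 2 4 6]
disp(flat);
disp(second);
disp(B);
```

The first column (1, 3) comes out before the second column (2, 4), which is the storage order described above.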
By default, everything is also done using double precision floating point numbers in MATLAB. MATLAB does have separate integer and logical (boolean) classes, but numbers you type in are doubles unless you explicitly convert them, and logical values behave like the numbers 0 and 1 in arithmetic. This is convenient in that you do not have to worry about how to perform operations on different data types. However, the downside is that double precision numbers take up more memory than small integer types, and much more memory than booleans (a boolean carries just one bit of information, while a double precision number requires 64!). If you are dealing with very large datasets that are intentionally stored in a different format to save space, you can sometimes crash MATLAB just loading the data into memory due to the huge increase in memory requirements when the data are converted to floating point numbers.
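To get a feel for the memory cost, the sketch below asks MATLAB how many bytes a vector of doubles occupies. The vector length of 1000 is an arbitrary choice for illustration.

```matlab
a = zeros(1000, 1);                        % 1000 double precision numbers
info = whos('a');                          % structure describing the variable a
bytes_per_entry = info.bytes / numel(a);   % 8 bytes, i.e. 64 bits, per entry
disp(info.bytes);
disp(bytes_per_entry);
```

At 8 bytes per entry, a dataset with a billion values needs roughly 8 GB once it is held as doubles.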
Basic Programming Tools
MATLAB contains all of the basic programming tools that we have seen so far: loops, conditional statements, functions, arrays, etc., so what we have studied so far will be useful in learning to program in MATLAB. We can manipulate matrices in MATLAB using many of the same tools we used in Python:
Basic operations: MATLAB has many built-in operations, such as +, -, *, /, and ^ (exponentiation). Additionally, many standard mathematical functions are included, such as exp, sin, cos, mod, etc. One nice trick about MATLAB is that you can usually apply an operation to an entire matrix: for instance, sin(a) will give you back a matrix containing the sine of each entry of the original matrix. You may also perform basic matrix operations such as addition, subtraction, and multiplication directly in MATLAB without having to loop over every entry in the matrix. This is convenient, as it lets you skip many loops in your code.
Loops: You can perform calculations using loops with MATLAB. Here is a simple example that does the same thing as we have seen in Python:
    for i=1:10
        i
    end
This will print out numbers 1 through 10, much like our simplest loop example in Python. You may have noticed in Python that if you type an expression into the interpreter interactively, Python prints out the result, but if you enter the same thing into a function or script, Python does not print out the result. MATLAB has a different convention: MATLAB always prints out the result of each statement entered (you will see i =, followed by each number), regardless of whether it is done interactively, through a script, or inside a function. This is why we did not need to include a print statement in this loop; simply entering i is sufficient to print out the result. Another way to print output is to use the disp function, which prints just the value without the variable name. To suppress the automatic output, put a semicolon at the end of each line. (This is one thing I find annoying about MATLAB, and few things irritate me more than forgetting a semicolon on some MATLAB code).
Indentation is not important in MATLAB; the end statement tells MATLAB when the loop ends. You should still indent your loops the same way we did in Python when submitting code (I will always do so in my examples), but it is not necessary for MATLAB to understand what you are trying to do. The same goes for all other types of code blocks that are described below.
To write the same thing as a while loop, we would enter
    i = 1;
    while i <= 10
        i
        i = i+1;
    end
As you can see, I can suppress output for the initial assignment for i with a semicolon, as well as when I increment i within the body of the loop. break and continue work in the same way as in Python when dealing with loops.
Conditionals: The syntax for conditional statements is similar to that of loops: end tells MATLAB when the conditional block is finished.
    i = 5;
    if i == 1
        'i is one'
    elseif i > 10
        'i is greater than 10'
    else
        'i is 10 or less'
    end
This also shows how strings are defined in MATLAB. You must use single quotes to define a string in MATLAB. To include a quote character in a string, precede it with another single quote: 'Eric''s program'. Note also that the 5 assigned to i above is not stored as an integer: by default MATLAB stores every number you type as a double precision floating point number.
Functions: You can define functions in MATLAB; however, functions must be saved as a separate file, whose name is the function name followed by .m. So to create a simple function add to add two numbers together, I would create a file “add.m” that contains the following:
    function c = add(a, b)
        % adds two numbers a and b together, returning the result
        c = a + b;
    end
Like Python, MATLAB allows for documentation to be embedded into a function, though unlike Python it uses comments (denoted with %) to serve as the documentation rather than a docstring. However, it serves the same purpose, and you can view the documentation for any function (built-in or user defined) using the help function. Typing help add will display the commented line in the MATLAB command window. Rather than using a return statement like in Python, we define the variables that will be returned from a function in the function definition and simply use those in the function body.
You can define other functions within a function file, and you can even define functions within functions (though they can only be called from within that function). However, you cannot define functions from the interpreter the way you can in Python. This is something I find very annoying about MATLAB, because I often like to define short, one-time functions from the command line when I am doing interactive work (and a big reason why I use Python more frequently). Functions also do not benefit from MATLAB’s “compiler” and I find that well-written functions in MATLAB can unfortunately run much more slowly than the same commands entered into the interpreter.
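Because the returned variables are named in the function definition, returning more than one value only requires listing several names in square brackets. The function below is a hypothetical example, not one of the lab files; it is assumed to be saved in its own file “sumdiff.m”.

```matlab
function [s, d] = sumdiff(a, b)
    % returns the sum s and the difference d of two numbers a and b
    s = a + b;   % first output
    d = a - b;   % second output
end
```

With that file on the MATLAB path, entering [s, d] = sumdiff(5, 3) in the command window should give s = 8 and d = 2, and help sumdiff should display the comment line.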
MATLAB supports default arguments, though it is more cumbersome to do so than in Python. To define default arguments, MATLAB has a variable nargin (short for number of arguments in) that represents the number of inputs given to a function. You can use this value to specify default inputs. For instance, if we want to write a function exponent that by default squares a number, we would create a file “exponent.m” that contains
    function expval = exponent(x, y)
        % raises input x to the power y (default is 2)
        if nargin == 1
            y = 2;
        end
        expval = x^y;
    end
Basically, you can use nargin to query if the function is called with fewer than the standard number of parameters, and depending on the number of inputs you can assign default values. However, this is less flexible than we saw in Python: you are only allowed to pick one case for each value of nargin, so if you have more than one optional argument you must decide in advance, for each value of nargin, which arguments were supplied and which ones get defaults.
Indexing Arrays: You can perform many of the same tricks in MATLAB as you can with Python. We will cover this in more detail in the next lab, but the syntax is quite similar. If a is a vector, a(1:2) returns a matrix containing only the first two elements of a, a(end) returns the last element, and a(4:2:10) takes every second entry starting with 4 and ending with 10 (inclusive). Note that this is slightly different from Python, both in the order of the numbers and in the differences between zero- and one-indexed arrays.
Assertions: You can (and should) use assertions in your MATLAB functions just like in Python. Use assert(<condition>), which stops with an error if the condition is false.
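The indexing and assertion items can be tried together in a few lines. This is a small sketch with made-up data; the variable names are illustrative only.

```matlab
a = 10:10:100;              % the vector 10 20 30 ... 100 (ten elements)
assert(numel(a) == 10);     % passes silently; a false condition raises an error
first_two = a(1:2);         % elements 1 and 2: 10 20
last_one = a(end);          % last element: 100
every_second = a(4:2:10);   % elements 4, 6, 8, 10: 40 60 80 100
disp(first_two);
disp(last_one);
disp(every_second);
```

Note that the step goes in the middle (start:step:stop) and that the stop index is included, unlike a Python slice.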
These basic tools should get you up and running with MATLAB.
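Returning to the first item in the list, the sketch below compares applying sin to a whole vector with doing the same work in an explicit loop, and shows that + and * act on whole matrices. The values are arbitrary and the variable names are illustrative.

```matlab
x = [0 pi/6 pi/2];

% whole-array version: one call handles every entry
y_vec = sin(x);

% equivalent loop version
y_loop = zeros(size(x));    % set aside space for the results
for i = 1:numel(x)
    y_loop(i) = sin(x(i));
end

% matrix operations work without loops as well
A = [1 2; 3 4];
B = [0 1; 1 0];
S = A + B;                  % entry-by-entry sum: [1 3; 4 4]
P = A * B;                  % matrix product: [2 1; 4 3]

disp(max(abs(y_vec - y_loop)));   % the two approaches agree
disp(S);
disp(P);
```

Both versions give the same answer; the whole-array form is simply shorter to write and easier to read.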
Importing External Data into MATLAB
So far, we have been developing programming skills. However, in order to analyze real data, we need to be able to read the data into our programs. With Python, we mostly used raw_input to do this, but eventually we will need to read data from files. There are several ways to do this in MATLAB:
load can be used if your file is a text file that only contains your data (i.e. no metadata is allowed, and it must be consistently formatted). MATLAB figures out how many rows and columns are in the data, and creates a matrix from the data. This is the simplest case, and I have provided two example files to use for this: type load <filename> to store the matrix in a variable named after the file (without the extension). To store the data in a different variable, use a = load(<filename>), where you must put the filename in quotes so that MATLAB treats it like a string.
The two provided files contain the same data (a 5 column x 20 row matrix), but stored in different row/column ordering: one is ordered by column (MATLAB’s standard way of storing data), the other by row. Try reading both in to see how they are organized differently. If you want to get the “correct” ordering for the row-ordered data, take the transpose of the matrix once it is read in (you can do this in MATLAB using the apostrophe: if A is a matrix, then A' is the transpose of that matrix).
fscanf can be used to read more complex data. You need to first open the file using fopen, then read the data with fscanf, and then close the file. Here is an example:
    fid = fopen('columnorder.dat','r');
    a = fscanf(fid, '%f', [5 20])';
    fclose(fid);
First, we open the file, giving the file name; the 'r' indicates that we are only reading from the file (not writing to the file). Once the file is open, we get what is called a file handle, which is just a number that tells MATLAB what file we want to read from (this lets us have more than one file open at a time).
Next, we use fscanf to read from the file specified by the handle fid. The '%f' means that we are reading a floating point number; we could also use '%i' for an integer and '%s' for a string (this reads characters up to the next whitespace; use '%c' to read a single character). The last entry tells us the shape of the matrix we want to make: it will contain 5 rows and 20 columns, after which I need to take the transpose to get the right matrix back. This is because fscanf reads the numbers in the order they appear in the file and fills the matrix one column at a time, whereas load makes each line of the file a row of the matrix. To read the row ordered data, open roworder.dat and enter a = fscanf(fid, '%f', [20 5]);
Finally I close the file handle using fclose. As you can see, this is more time consuming than load, but if you have data that is formatted in a complex way, this is how you can tell MATLAB how to deal with it.
fread is for reading binary data. I have provided the column order data as a binary file (saved with double precision format with a little endian byte ordering). To read this data, I can do the following:
    fid = fopen('columnorder_binary_double.dat','rb');
    a = fread(fid, Inf, 'double');
    fclose(fid);
    a = reshape(a, [20 5]);
This will read the data into a single vector, which is then reformatted into a 20x5 matrix using a = reshape(a, [20 5]). If you wanted to have MATLAB do the reshaping for you, use a = fread(fid, [20 5], 'double'), analogous to how fscanf works. If your data was not double precision, or had a byte-ordering that is different from the conventions on this computer, you will need to supply different commands (see the documentation for more details). If you want to try a file with a different byte ordering and format, I have included the same data with a big endian convention and a 32 bit integer format, which you can read using fread(fid, [20 5], 'int32', 'b').
While binary data has the disadvantage of not being human readable, reading and writing binary data is usually significantly faster than the equivalent text version of the data. If you find yourself waiting frequently for large datasets to read or write, converting to binary can often speed things up.
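If you want to experiment with binary files without relying on the provided data, the sketch below writes a small matrix to a scratch file and reads it back both ways described above. The matrix, the scratch file name from tempname, and the use of fwrite to create the file are assumptions made for this example. It also uses the five-argument form of fread, with an explicit skip of 0 before the byte order.

```matlab
A = [1 2; 3 4; 5 6];             % small 3x2 test matrix
fname = [tempname() '.bin'];     % scratch file name (illustrative)

% write as double precision; entries go out in column order: 1 3 5 2 4 6
fid = fopen(fname, 'w');
fwrite(fid, A, 'double');
fclose(fid);

% read everything into one long vector, then reshape
fid = fopen(fname, 'r');
v = fread(fid, Inf, 'double');   % 6x1 column vector
fclose(fid);
B = reshape(v, [3 2]);

% rewrite the same values as big endian 32 bit integers
fid = fopen(fname, 'w');
fwrite(fid, A, 'int32', 0, 'b');
fclose(fid);

% read them back, letting fread do the reshaping
fid = fopen(fname, 'r');
C = fread(fid, [3 2], 'int32', 0, 'b');
fclose(fid);

delete(fname);                   % clean up the scratch file
disp(isequal(A, B));
disp(isequal(A, C));
```

Both comparisons should print 1 (true). Reading the integer file with the wrong precision or byte order would run without complaint but return nonsense values, so it is worth checking a few entries by eye.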
These are the basic ways to load data, though MATLAB has some additional functions to handle text data with delimiters between each matrix entry (dlmread). See the documentation if you are having trouble reading a data file in a particular format; there is a good chance there may already be a built-in way to handle the data. Some of your homework will require reading data of various formats, so spend some time when doing the exercises learning how these commands work.
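To compare load and fscanf on the same text file, here is a self-contained sketch that writes a small matrix to a scratch file and reads it back both ways. The matrix, the scratch file name, and the use of fprintf to create the file are assumptions for illustration and are not part of the provided lab files.

```matlab
A = [1 2; 3 4; 5 6];             % small 3x2 test matrix
fname = [tempname() '.dat'];     % scratch file name (illustrative)

% write one matrix row per line; A' makes fprintf walk along the rows
fid = fopen(fname, 'w');
fprintf(fid, '%f %f\n', A');
fclose(fid);

% load turns each line of the file into a row of the matrix
B = load(fname);

% fscanf fills column by column, so read a 2x3 matrix and transpose it
fid = fopen(fname, 'r');
C = fscanf(fid, '%f', [2 3])';
fclose(fid);

delete(fname);                   % clean up the scratch file
disp(isequal(A, B));
disp(isequal(A, C));
```

Both comparisons should print 1 (true). Leaving off the transpose in the fscanf line gives a 2x3 matrix instead, which is the same situation as reading the provided files with the wrong shape.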
Practice
Here are some problems to work on in lab today with MATLAB.
- Practice reading the different data types into MATLAB with the files I have provided.
- Rewrite your square root function from the previous lab as a MATLAB function. As a reminder, we can calculate the square root of a number \({s}\) as follows: (1) take a guess \({x_n}\), (2) check if \({x_n^2=s}\) within some allowed amount of tolerance (if so, we have a good enough answer and we can exit), (3) if the guess isn’t good enough, make a new guess \({x_{n+1}=(x_n+s/x_n)/2}\) and return to step 2. Write a function calcsqrt to implement this method, taking \({s}\) as an input and returning the square root. Make sure you put an appropriate assertion statement into your code (think about what could cause your calculation to fail), and use print statements to help debug it. Write a test function test_calcsqrt to test your code. To confirm that your code works, you can use the built-in function sqrt (or use exponentiation with a power of 0.5) in MATLAB to check your results.
- Rewrite your function to test if the entries of a matrix are prime, putting the respective values in a matrix of the same shape. If the entry is prime, use a value of 1 to indicate true, otherwise use a value of 0 (false). As before, it is a good idea to define a helper function to determine if a single number is prime.
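To make the update rule in the square root problem concrete, here is the arithmetic for \({s=2}\), starting from an assumed first guess \({x_0=1}\) (the starting guess and the number of steps shown are illustrative choices, not requirements of the exercise):

\[
x_1 = \frac{1 + 2/1}{2} = 1.5, \qquad
x_2 = \frac{1.5 + 2/1.5}{2} \approx 1.41667, \qquad
x_3 = \frac{x_2 + 2/x_2}{2} \approx 1.41422
\]

After three updates \({x_3^2 \approx 2.000006}\), so a tolerance of \({10^{-5}}\) on \({|x_n^2 - s|}\) would already be satisfied. Values like these make handy test cases for test_calcsqrt.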