Wednesday, July 24, 2013

What is the distance between two locations?

There are many scripts and functions available in the internet to answer that question, including the MatLab own mapping toolbox and the m_map toolbox.

But for mapping many data points in a reasonable small area (less than 1000km diameter) there is a very fast approximation:

lat_vec is a vector of location latitudes and lon_vec the respective longitudes. The distance in meters between all those locations and a center point clat, clon can be written as:

distances = sqrt( ...
    (lat_vec - clat).^2 +  ...
    ((lon_vec - clon) .* (cosd(lat_vec)+cosd(clat)./2)).^2 ...
    ) .* 110000;

this assumes are small scale Cartesian grid, thus the distances are not great-circle distances. The error on scales < 2000km should be small, thus acceptable for data mapping, but not for navigation or similar purposes.

Thursday, January 13, 2011

Max, Min and other small stuff

Just a few more tips and tricks how to improve every day code parts:
In case you are looking for the absolute maximum of a matrix, you could write:

abs_max = max(max(x));

or the more faster way:

abs_max = max(x(:));

Obviously the same is true for the minimum: min and any other function that usually runs along the first non singleton dimension of a matrix respectively.
This helps in many cases, especially when combined with direct indexing. Just for instance if you want to determine the number of NaN within a matrix. You could do:


gef = find(isnan(x(:)));
num_nan = length(gef);


gives you the number of NaNs. But this is slow and unnecessary, same can be done with a sum of a logical vector:

num_nan = sum(isnan(x(:)));


this further would work with any relation, like finding all elements larger than pi:


num_lg_pi = sum(x(:)>pi);


Another import issue to improve speed of your m-files is direct indexing and comparison optimization. Let's assume we have a matrix Depth(m,n) with depth values of an area and corresponding Lat(m,n) and Lon(m,n) matrices with the positions of each Depth value.


We are looking for the continental shelf break data, thus want to know the grid points that are shallower than 2000m and deeper than 500m.
One method would be:


[gefm gefn] = find(Depth<2000 & Depth>500);


or


gef = find(Depth<2000 & Depth>500);

and then use the found indices in the matrices Lat and Lon. But since simple addition and subtraction are faster than comparisons a faster way would be:


gef = abs(Depth-1250)<750;


and use the resulting logical matrix gef in Lat and Lon, or even use it direct if you plan not to use the results several times:


Lat(abs(Depth-1250)<750)

and

Lon(abs(Depth-1250)<750)

this method is applicable whenever you are looking for data in between two values. 
Now as a last part let's assume you have a large m-file and want to find 'data in the middle' regularly as for the depth above, looking for Data that is larger or equal min_val and smaller or equal max_val. Unfortunately Matlab has no function like:
min_val <= Data <= max_val  
but one easy way of doing that would be to define following function in you m-file header:

inbetween = @(min_val,Data,max_val)  abs(Data-(max_val+min_val)/2) < (max_val-min_val)/2 ;

then within your code you only have to do:

Lat(inbetween(500,Depth,2000))

to access all the desired Lat data that has depth between 500 and 2000. Hope that helps to make the code slimmer and easier to read!
The fastest way probably would be to copy the internal function part:

 abs(Data-(max_val+min_val)/2) < (max_val-min_val)/2

directly into the Lat(...) 

Fast Unwrap

On towards my first Matlab entry:

I often have to 'unwrap' longitudinal data.
I have a vector a matrix of longitudinal data
lon(m,n)
now these get displaced, shifted or what so ever and the original matrix has elements that are >180 or <-180. (Or if you prefer the 0 to 360 space: <0 or >360, or pi respectively)

To plot these data I usually have to unwrap them which I did with:


lon(lon>180) = lon(lon>180) -360;
lon(lon<=-180) = lon(lon<=-180) +360;


but doing so several times, hundreds of times takes some time. Following approach does not need to compare sizes and is significant faster:


lon = lon - 360.*sign(lon).*(1+sign(abs(lon+eps)-180))./2;


the +eps is necessary to prevent sign(0), assuming data is not close to machine precision.
There is no need to add an eps in the middle of the 2nd term, since the last part of term two is zero in case lon is close to zero.
Now you can improve this code to the following, which actually is substantially faster, for the fact that two scalars get added prior to the addition of a scalar to a vector, compared to the above case in which twice a scalar is added to a vector, but with the limitation that you now only check for < and > at then ends or  => and <= at both ends, no longer only the = in one direction:


lon = lon - 360.*sign(lon).*(1+sign(abs(lon)-(180-eps)))./2;

In case you are looking at (lon>=180) and (lon<=-180) data, just turn the -eps into a +eps.

But even in case you know your longitudinal data is displaced only into one direction and your code looks like:


lon(lon>180) = lon(lon>180) -360;


the code above does 'extends' to:


lon = lon - 360.*(1+sign(lon-(180-eps)))./2;


and similar approach for lon >= 180 instead just > as above: change the sign in front of the eps.
Still this code can be excuted much faster if you rewrite it to:


lon = lon - 180 + 180*sign(lon-(180-eps)));

Just how much faster does this code execute, well on my machine you need about 33% less computation time for data with larger and smaller values with the new approach and
for the case with data only displaced into one direction it is about 25%.
But with data on a sphere, cylinder or circle where you do unwrap regularly it can accumulate to a significant amount.

I didn't come up with an even faster way - maybe someone else has an even more sophisticated idea.

oh - I forgot, in case you wondered how this compares to the Matlab internal 'unwrap' - it is way way WAY faster!

First Post

Just realized that I tend to forget many of my Matlab ideas after a while. Therefore this blog to share ideas how to address Matlab problems I do encounter more or less regularly.
Feel free to share your ideas and comments here - in case you want to contribute regularly, I am looking for co-authors!