%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import rasterio
import geojson
import firedrake
import icepack, icepack.plot, icepack.models

Larsen Ice Shelf

This demo will involve using real data for the Larsen Ice Shelf in the Antarctic Peninsula. The use of real data will mostly change how we set up the simulation. The simulation itself – involving successive prognostic and diagnostic solves of the physics model – is virtually identical to what we saw in the last demo.

To access the data, you’ll need to have a login for EarthData, the web portal through which NASA makes remote sensing data available to the public. Most of the ice sheet remote sensing data produced by American research institutions is hosted at the National Snow and Ice Data Center (NSIDC) and an EarthData login is necessary to access data from NSIDC.

The external data that we will use are:

  • the BedMachine thickness map of Antarctica

  • a velocity map of Antarctica produced as part of the MEaSUREs program, which you can read more about here

  • a satellite image of all of Antarctica taken from MODIS

  • an outline of the Larsen C Ice Shelf, which I created by tracing over this satellite image in a geographic information system.

Rather than manually download these data sets from the websites they’re hosted on, we’ll call a few functions in the module icepack.datasets to fetch them for us. (Internally, these functions use a library called pooch which handles things like caching the data so it doesn’t get downloaded twice, unzipping archived files, and so forth.) One we have these data sets we’ll use the library rasterio to read the gridded data and GeoJSON for the vector data. Pooch, rasterio, and GeoJSON will have been installed along with icepack, so you don’t need to do this yourself.

Geometry

First, we’ll fetch a GeoJSON file describing the outline of the domain. GeoJSON is a common file format for geospatial vector data. In the previous demo, we generated a .geo file describing the outline of the domain, and then called gmsh to create a triangulation of the interior. For this demo, we’ll use a different helper script that will turn our .geojson file into the .geo format that gmsh expects.

To retrieve the external data, we’ll use several functions in the module icepack.datasets. All of these functions start with fetch. These functions retrieve the external data from the internet and put them in a predictable location so they can be found easily later. The files will only be downloaded the first time you fetch them. This caching functionality will come in handy because we’ll be using much of the same data in later demos.

outline_filename = icepack.datasets.fetch_larsen_outline()
print(outline_filename)
/home/daniel/.cache/icepack/larsen.geojson

To read this file we’ll use the GeoJSON package. We won’t go into a great amount of detail about analyzing geospatial vector data here, but a few basic features are worth going over.

with open(outline_filename, 'r') as outline_file:
    outline = geojson.load(outline_file)

From the user’s perspective, a GeoJSON object looks like a big nested dictionary, and somewhere down the line are some arrays of coordinates. Here we’ll access the coordinate reference system (CRS) that the data are stored in. The most common reference systems are standardized and given numeric ID codes by a standards body, the European Petroleum Survey Group (EPSG). The most common CRS for Antarctic data sets is EPSG:3031, a stereographic projection centered on the South Pole.

print(outline['crs']['properties']['name'])
urn:ogc:def:crs:EPSG::3031

The data we care about in this GeoJSON object are the coordinates of all the features. Here we’ll compute a bounding box for the domain to illustrate how one iterates over all of the features. In this case, every feature of this object is a MultiLineString.

features = [feature['geometry'] for feature in outline['features']]
xmin, ymin, xmax, ymax = np.inf, np.inf, -np.inf, -np.inf
δ = 50e3
for feature in outline['features']:
    for line_string in feature['geometry']['coordinates']:
        xs = np.array(line_string)
        x, y = xs[:, 0], xs[:, 1]
        xmin, ymin = min(xmin, x.min() - δ), min(ymin, y.min() - δ)
        xmax, ymax = max(xmax, x.max() + δ), max(ymax, y.max() + δ)

print('{:e}, {:e}, {:e}, {:e}'.format(xmin, ymin, xmax, ymax))
-2.384857e+06, 9.358604e+05, -1.946497e+06, 1.325843e+06

We can plot the domain outline below to see that everything lines up right.

fig, axes = icepack.plot.subplots()

for feature in outline['features']:
    for line_string in feature['geometry']['coordinates']:
        xs = np.array(line_string)
        axes.plot(xs[:, 0], xs[:, 1], linewidth=2)

axes.set_xlabel('meters')
plt.show(fig)
_images/icepack.demo.02-larsen-ice-shelf_11_0.png

But without some additional context you might not know what each segment of the boundary corresponds to on the real ice shelf. To make that context more apparent, we’ll show how to plot things on top of satellite imagery next.

Imagery

We’ll use the Mosaic of Antarctica (MOA) as a background for all the plots we make in the following. This mosaic was created by compiling several hundred images from MODIS. We could also use imagery from other satellites like Landsat-8 if we wanted higher spatial or radiometric resolution.

The image mosaic is stored as a GeoTIFF file. GeoTIFF is a common storage format for geospatial data; it adds georeferencing information on top of the TIFF file format, which is often used for lossless compression of images. The function rasterio.open will give us an object representing the raster data set that we can then read from.

image_filename = icepack.datasets.fetch_mosaic_of_antarctica()
image_file = rasterio.open(image_filename, 'r')

We’ve opened the file but we haven’t read any data yet. The image file covers all of Antarctica, so it would be wasteful to read the entire image. Instead, we’ll read a window that covers the bounding box we calculated above.

height, width = image_file.height, image_file.width
transform = image_file.transform
window = rasterio.windows.from_bounds(left=xmin, bottom=ymin, right=xmax, top=ymax,
                                      width=width, height=height, transform=transform)

Now we can pass the window to the read method of image_file, which will return a numpy array of the image values over the area that we want. The indexes argument specifies that we’re reading only band 1; since this is a grayscale image, that’s all we can read. For RGB or other multi-spectral images, you might want to get more of the spectral bands.

image = image_file.read(indexes=1, window=window, masked=True)

Now we can make a figure showing the image of the Larsen Ice Shelf together with the various segments of the domain boundary. To add in the spatial coordinates of all the image pixels, we’ve passed in the bounding box of the window that we created earlier to imshow via the keyword extent. The vmin and vmax arguments were tuned by trial and error to get a nice contrast level. You can make out where the ice is grounded or floating, where there are ice rises, and if you change the balance quite a bit you can even pick out rifts in the ice shelf.

def subplots(*args, **kwargs):
    fig, axes = icepack.plot.subplots()
    xmin, ymin, xmax, ymax = rasterio.windows.bounds(window, transform)
    axes.imshow(image, extent=(xmin, xmax, ymin, ymax),
                cmap='Greys_r', vmin=12e3, vmax=16.38e3)

    return fig, axes

We’ll use this satellite image on every plot in this demo. Rather than add the same boilerplate code every time, the code above defines a wrapper function that creates figure and axes objects and adds the image to the plot.

fig, axes = subplots()

for feature in outline['features']:
    for line_string in feature['geometry']['coordinates']:
        xs = np.array(line_string)
        axes.plot(xs[:, 0], xs[:, 1], linewidth=2)

axes.set_xlabel('meters')
plt.show(fig)
_images/icepack.demo.02-larsen-ice-shelf_22_0.png

Checking that the domain boundary lines up well with features that are visible in the satellite image is a good sanity check. This way we know that the coordinate systems haven’t been mixed up, that you haven’t accidentally loaded up a mesh of a totally different ice shelf, and so forth.

Meshing

Next we’ll take this GeoJSON object and translate it into a geometry object from pygmsh. The function to do that is contained in the module icepack.meshing. We can then save this to a .geo file and run gmsh on the result, just like we did in the previous demo.

geometry = icepack.meshing.collection_to_geo(outline)
with open('larsen.geo', 'w') as geo_file:
    geo_file.write(geometry.get_code())

The call to gmsh is the same as in the previous demo, but we’ll make the output less verbose by passing the flag -v 2 as well.

!gmsh -2 -format msh2 -v 2 -o larsen.msh larsen.geo

Now that we’ve generated the mesh we can read it just like we did in the previous demo.

mesh = firedrake.Mesh('larsen.msh')

Finally we’ll make a plot of the mesh so that we can see all the boundary IDs. Boundary segments 1 and 3 correspond to the calving terminus and these are where Neumann boundary conditions should be applied. Segment 2 borders the Gipps Ice Rise, and the remaining segments are where ice is flowing in from. To make things a little clearer visually we’re also specifying a muuch smaller line weight for the interior edges compared to the boundary edges.

fig, axes = subplots()
axes.set_xlabel('meters')
kwargs = {
    'interior_kw': {'linewidth': .25},
    'boundary_kw': {'linewidth': 2}
}
icepack.plot.triplot(mesh, axes=axes, **kwargs)
axes.legend()
plt.show(fig)
_images/icepack.demo.02-larsen-ice-shelf_32_0.png

Input data

Next, we have to load the input data, starting with the ice thickness. The BedMachine Antarctica dataset is hosted on NSIDC; the function fetch_bedmachine_antarctica will prompt you for your EarthData login if need be and download the dataset from the internet.

thickness_filename = icepack.datasets.fetch_bedmachine_antarctica()

The data are stored in a NetCDF file. NetCDF is a common storage format for geophysical data, especially in atmospheric science. NetCDF offers a lot more freedom than GeoTIFF in terms of what kind of data can be stored and how, so you have to know something about the schema or layout before you use it. For example, many fields can be stored by name in a NetCDF file, and you have to know what all the names are. The script ncinfo will print out information about all the fields stored in a NetCDF file.

!ncinfo "$thickness_filename"
/usr/bin/sh: ncinfo: command not found

From this information we can see that the name of the field we want is thickness. We can use rasterio to read NetCDF files, but we have to add in a bit of extra magic so that it knows which field to extract from the file. To specify which field we’re reading, we can prepend netcdf: to the beginning of the filename and append :FIELD_NAME to the string we pass to rasterio.open.

thickness = rasterio.open('netcdf:' + thickness_filename + ':thickness', 'r')

Next, we have to fetch the velocity data, which are also hosted on NSIDC. The file is ~6GiB, so if you run this demo yourself, this step could take a while.

velocity_filename = icepack.datasets.fetch_measures_antarctica()
100%|█████████████████████████████████████| 6.97G/6.97G [00:00<00:00, 1.71TB/s]

The fields we want are VX and VY. The velocity dataset is also stored as a NetCDF file; the names of the fields we want are VX and VY.

vx = rasterio.open('netcdf:' + velocity_filename + ':VX', 'r')
vy = rasterio.open('netcdf:' + velocity_filename + ':VY', 'r')

Modeling

Having done all the leg work to make a mesh and get a good set of input data, the modeling itself should be fairly familiar from the last step. We’ll assume that the ice temperature is a uniform -13C.

One thing is substantially different from previous examples. Before, we called the function firedrake.SpatialCoordinate to get some symbolic handles x, y for the mesh coordinates, and we created symbolic expressions to define the input data to our problem analytically. Now that we’re working with real glaciers, we’re reading in all the observational data using rasterio, but Firedrake doesn’t know how to interpolate a rasterio dataset. The function icepack.interpolate works as a layer on top of the firedrake interpolate function and knows what to do with gridded data sets.

Q = firedrake.FunctionSpace(mesh, family='CG', degree=2)
V = firedrake.VectorFunctionSpace(mesh, family='CG', degree=2)

h0 = icepack.interpolate(thickness, Q)
u0 = icepack.interpolate((vx, vy), V)
fig, axes = subplots()
streamlines = icepack.plot.streamplot(u0, precision=1000, density=2500, axes=axes)
fig.colorbar(streamlines, label='meters/year')
plt.show(fig)
_images/icepack.demo.02-larsen-ice-shelf_45_0.png

We’ll set up the model in the same way as the synthetic ice shelf demo, but there are more boundary segments for which we need to apply Dirichlet conditions.

T = 260
A = firedrake.Constant(icepack.rate_factor(T))
ice_shelf = icepack.models.IceShelf()
opts = {'dirichlet_ids': [2, 4, 5, 6, 7, 8, 9], 'tol': 1e-6}

Next, we’ll try to solve the shallow shelf equations to obtain the initial ice velocity.

try:
    u = ice_shelf.diagnostic_solve(u0=u0, h=h0, A=A, **opts)
except:
    print('Oh no, nonlinear solver did not converge!')
Oh no, nonlinear solver did not converge!

And now we’ve got our first taste of how difficult working with real data can be! Why did the nonlinear solver fail to converge? The ice thickness map from BedMachine is fine enough to resolve pretty small-scale topography, even down to the depth of rifts in the ice shelf. Having such a high-resolution dataset means that the thickness can jump over a very small distance. This could result in an unphysically large driving stress, which is shown in the plot below.

from icepack.constants import ice_density as ρ_I, water_density as ρ_W, gravity as g
from firedrake import grad
τ_d = firedrake.interpolate(-1e3 * ρ_I * g * (1 - ρ_I / ρ_W) * grad(h0**2), V)
fig, axes = subplots()
contours = icepack.plot.tricontourf(τ_d, np.linspace(0., 100., 51),
                                    extend='max', axes=axes)
fig.colorbar(contours, label='kPa')
<matplotlib.colorbar.Colorbar at 0x7f951c080790>
_images/icepack.demo.02-larsen-ice-shelf_52_1.png

In many regions the driving stress exceeds 100 kilopascals, which would be reasonable for a steep Greenland outlet glacier, not for a floating ice shelf.

A quick hack around this would be to synthesize a new, smoother thickness field makes that matches the direct observations as well as possible. In other words, we want a field \(h\) that minimizes the following functional:

\[J(h) = \frac{1}{2}\int_\Omega\left((h - h^o)^2 + \alpha^2|\nabla h|^2\right)dx.\]

The first term quantifies the degree to which our modeled thickness \(h\) matches the observations \(h^o\), while the second term quantifies how oscillatory \(h\) is. The factor \(\alpha\) is a length that dictates the distance over which we want to penalize sharp oscillations. I came up with a smoothing length of 2km by trial and error. The following code will find a minimizer of \(J\) by looking for a thickness at which the derivative of \(J\) is zero.

from firedrake import assemble, inner, dx
h = h0.copy(deepcopy=True)
α = firedrake.Constant(2e3)
J = 0.5 * (h - h0)**2 * dx + 0.5 * α**2 * inner(grad(h), grad(h)) * dx
F = firedrake.derivative(J, h)
firedrake.solve(F == 0, h)

The relative difference between the raw and smoothed thickness is around 3.5%. That’s not a huge relative error; the corrections for air or water in the firn column are comparable.

print(assemble(abs(h - h0) * dx) / assemble(h * dx))
0.03386012075948286

Finally, the driving stress of the filtered ice thickness tops out around 20 kPa, which is much more reasonable.

τ_d = firedrake.interpolate(-1e3 * ρ_I * g * (1 - ρ_I / ρ_W) * grad(h**2), V)
fig, axes = subplots()
contours = icepack.plot.tricontourf(τ_d, np.linspace(0., 20., 41),
                                    extend='max', axes=axes)
fig.colorbar(contours, label='kPa')
<matplotlib.colorbar.Colorbar at 0x7f950ec98290>
_images/icepack.demo.02-larsen-ice-shelf_58_1.png

With this regularized input data the nonlinear solver converges. In general, the nonlinear solvers in icepack should always converge on reasonable input data, and sometimes even on unreasonable data too. A sharp gradient in an overly-resolved thickness field, however, will create such large driving stresses as to no longer be reasonable.

u = ice_shelf.diagnostic_solve(u0=u0, h=h, A=A, **opts)

The computed velocity is shown below; the result looks pretty reasonable if we’re going by the speeds.

fig, axes = subplots()
streamlines = icepack.plot.streamplot(u, precision=1000, density=2500, axes=axes)
fig.colorbar(streamlines, label='meters/year')
plt.show(fig)
_images/icepack.demo.02-larsen-ice-shelf_62_0.png

As a further sanity check, we can see how far off the computed velocity is from the observed values. We get a fairly reasonable approximation even with a spatially homogeneous guess for the ice temperature.

print(icepack.norm(u - u0) / icepack.norm(u0))
0.1799444814347266

Ballpark estimate, the surface and basal mass balance of Larsen C are +30 and -30 cm/yr respectively, so we can take the total to be 0. Let’s simulate the evolution of the ice shelf for the next 10 years. The code for this loop should be familiar from the previous example.

a = firedrake.Function(Q)
h_init = h.copy(deepcopy=True)

final_time = 10.
num_steps = 40
dt = final_time / num_steps
for step in range(num_steps + 1):
    h = ice_shelf.prognostic_solve(dt, h0=h, a=a, u=u, h_inflow=h_init)
    u = ice_shelf.diagnostic_solve(u0=u, h=h, A=A, **opts)

In the plot below we’re using the alpha keyword argument to the contouring function. This makes the thickness contours about half-way transparent so we can see the satellite image underneath.

fig, axes = subplots()
contours = icepack.plot.tricontourf(h, 20, alpha=0.6, axes=axes)
fig.colorbar(contours, label='meters')
plt.show(fig)
_images/icepack.demo.02-larsen-ice-shelf_68_0.png

By plotting the difference between the modeled thickness after 10 years and the initial thickness, we can see the propagation of the rifts downstream. This effect is best visualized with a diverging colormap that makes the 0-contour really obvious.

δh = firedrake.Function(Q)
δh.assign(h - h_init)

fig, axes = subplots()
contours = icepack.plot.tricontourf(δh, axes=axes, cmap='RdBu', alpha=0.25,
                                    levels=np.linspace(-10, +10, 11), extend='both')
fig.colorbar(contours, label='meters')
plt.show(fig)
_images/icepack.demo.02-larsen-ice-shelf_70_0.png

The greatest mass losses occurred nearest to the calving terminus. The pattern in mass change near the rifts on the right edge of the domain is more from advecting those oscillations downstream. From the overall pattern, it’s clear that the ice shelf is losing mass. To see exactly how much, we can integrate the total thickness change and divide by the area of the shelf.

area = assemble(firedrake.Constant(1) * dx(mesh))
print(assemble(δh * dx) / area)
-3.898412855477131

Seeing as the simulation ran for 10 years, this isn’t a wildly unrealistic number. Instead of guessing the net accumulation rate, we could improve the fidelity of our simulation by forcing it with the output of a climate model.

Conclusion

In the last demo, we showed how to simulate ice shelf flow using synthetic data. Here we showed how to load in a generated mesh and observational data, and we used this same functionality to simulate a real ice shelf.

Many real data sets require some amount of preprocessing before they can be used for modeling. For example, many velocity data sets have missing pixels or patches due to noise in the optical or radar imagery, and these missing points have to be filled in somehow. The Bedmap2 thickness also contains processing artifacts that are visible as depressions running diagonally across the ice shelf. These artifacts could be removed by using a low-pass filter on the gridded data, although this might also wash out some real features like the many rifts in the ice.

In order to run the simulation, we had to come up with a guess for the ice rheology. The simple choice we made is quite far from the real value and in a subsequent demo we’ll show how to estimate it from observational data.