%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import rasterio
import geojson
import firedrake
import icepack, icepack.plot, icepack.models

Inverse problems

In this demo, we’ll revisit the Larsen Ice Shelf. This time, we’re going to estimate the fluidity coefficient \(A\) in Glen’s flow law

\[\dot\varepsilon = A\tau^3\]

from observational data. In the previous demos, we’ve come up with some value of the fluidity coefficient and computed a velocity field by solving an elliptic partial differential equation. The fluidity coefficient is roughly a known function of the ice temperature, together with some fudge factors for crystal fabric or large-scale damage, so we know an approximate range of values that it could take. Nonetheless, we don’t have large-scale measurements of the fluidity coefficient from remote sensing like we do for ice velocity and thickness.

Instead, we can try to come up with a value of \(A\) that gives a velocity field closest to what we observed. This idea can be turned into a constrained optimization problem. The quantity we wish to optimize is the misfit between the computed velocity \(u\) and the observed velocity \(u^o\):

\[E(u) = \frac{1}{2}\int_\Omega\left(\frac{u - u^o}{\sigma}\right)^2dx,\]

where \(\sigma\) are the standard deviations of the measurements.

One constraint is that the fluidity field has to be positive. Inequality constraints can require substantially more sophisticated numerical methods. To avoid this problem, we’ll cheat our way out by reparameterizing \(A\) in terms of a new variable \(\theta\):

\[A = A_0e^\theta.\]

No matter the value of \(\theta\), \(A\) is always positive. To make this change, we’ll give the IceShelf object our own custom-made function for calculating the viscous part of the action functional, just like we did for the friction in the last demo.

In addition to minimizing the misfit, we also want to have a relatively smooth value of the parameter field \(\theta\). The regularization functional \(R\) is included to penalize oscillations of size \(\Theta\) over a given length scale \(L\):

\[R(\theta) = \frac{L^2}{2\Theta^2}\int_\Omega|\nabla \theta|^2dx.\]

Finally, let \(F(u, \theta)\) be the weak form of the shallow shelf equations, again using the new parameter \(\theta\) instead of the fluidity \(A\). The physics constraint for our problem is that \(F(u, \theta) = 0\). We can enforce this constraint by introducing the Lagrange multiplier \(\lambda\), in which case the combined objective functional is

\[J(u, \theta; \lambda) = E(u) + R(\theta) + \langle F(u, \theta), \lambda\rangle.\]

We can calculate the derivative of this functional with respect to \(\theta\) by using the adjoint method. We can then use a descent method to iterate towards a critical point, which is hopefully close to the true value of the fluidity coefficient.

Input data

The input data are the same as from the previous demo of the Larsen Ice Shelf.

outline_filename = icepack.datasets.fetch_larsen_outline()
with open(outline_filename, 'r') as outline_file:
    outline = geojson.load(outline_file)

geometry = icepack.meshing.collection_to_geo(outline)
with open('larsen.geo', 'w') as geo_file:
!gmsh -2 -format msh2 -v 2 -o larsen.msh larsen.geo
mesh = firedrake.Mesh('larsen.msh')
features = [feature['geometry'] for feature in outline['features']]
xmin, ymin, xmax, ymax = np.inf, np.inf, -np.inf, -np.inf
δ = 50e3
for feature in outline['features']:
    for line_string in feature['geometry']['coordinates']:
        xs = np.array(line_string)
        x, y = xs[:, 0], xs[:, 1]
        xmin, ymin = min(xmin, x.min() - δ), min(ymin, y.min() - δ)
        xmax, ymax = max(xmax, x.max() + δ), max(ymax, y.max() + δ)

image_filename = icepack.datasets.fetch_mosaic_of_antarctica()
with rasterio.open(image_filename, 'r') as image_file:
    height, width = image_file.height, image_file.width
    transform = image_file.transform
    window = rasterio.windows.from_bounds(left=xmin, bottom=ymin, right=xmax, top=ymax,
                                          width=width, height=height, transform=transform)
    image = image_file.read(indexes=1, window=window, masked=True)

def subplots(*args, **kwargs):
    fig, axes = icepack.plot.subplots()
    xmin, ymin, xmax, ymax = rasterio.windows.bounds(window, transform)
    axes.imshow(image, extent=(xmin, xmax, ymin, ymax),
                cmap='Greys_r', vmin=12e3, vmax=16.38e3)

    return fig, axes
fig, axes = subplots()
kwargs = {
    'interior_kw': {'linewidth': .25},
    'boundary_kw': {'linewidth': 2}
icepack.plot.triplot(mesh, axes=axes, **kwargs)

Just like in the 2nd demo, we’ll apply the smoothing filter to the thickness, which is necessary to get a reasonable driving stress.

thickness_filename = icepack.datasets.fetch_bedmachine_antarctica()
thickness = rasterio.open('netcdf:' + thickness_filename + ':thickness', 'r')

Q = firedrake.FunctionSpace(mesh, family='CG', degree=2)
h0 = icepack.interpolate(thickness, Q)

from firedrake import inner, grad, dx
h = h0.copy(deepcopy=True)
α = firedrake.Constant(2e3)
J = 0.5 * ((h - h0)**2 + α**2 * inner(grad(h), grad(h))) * dx
F = firedrake.derivative(J, h)
firedrake.solve(F == 0, h)

In addition to the velocities themselves, we also need the estimates of the velocity measurement errors. The fidelity of the measurements tells us how good a fit to the data we should expect.

velocity_filename = icepack.datasets.fetch_measures_antarctica()
vx = rasterio.open('netcdf:' + velocity_filename + ':VX', 'r')
vy = rasterio.open('netcdf:' + velocity_filename + ':VY', 'r')
stdx = rasterio.open('netcdf:' + velocity_filename + ':STDX', 'r')
stdy = rasterio.open('netcdf:' + velocity_filename + ':STDY', 'r')

V = firedrake.VectorFunctionSpace(mesh, family='CG', degree=2)
u_obs = icepack.interpolate((vx, vy), V)
σx = icepack.interpolate(stdx, Q)
σy = icepack.interpolate(stdy, Q)

Here we’ll plot the velocity errors. You can see from the stripey pattern that they depend on the particular swath from the observational platform.

σ = firedrake.interpolate(firedrake.sqrt(σx**2 + σy**2), Q)
fig, axes = subplots()
contours = icepack.plot.tricontourf(σ, 20, alpha=0.9, axes=axes)

We need to make an initial guess for the fluidity parameter. In this case, we’ll use the same value as in the second demo – a constant fluidity assuming a temperature of -13C.

T0 = 260
A0 = icepack.rate_factor(T0)
from icepack.constants import glen_flow_law as n
def viscosity(u, h, θ):
    A = A0 * firedrake.exp(θ)
    return icepack.models.viscosity.viscosity_depth_averaged(u, h, A)

θ = firedrake.Function(Q)

ice_shelf = icepack.models.IceShelf(viscosity=viscosity)
opts = {'dirichlet_ids': [2, 4, 5, 6, 7, 8, 9], 'tol': 1e-6}
u = ice_shelf.diagnostic_solve(u0=u_obs, h=h, θ=θ, **opts)
fig, axes = subplots()
contours = icepack.plot.tricontourf(u, 20, alpha=0.6, axes=axes)

Inferring the fluidity

There are four parts that go into an inverse problem:

  • a physics model

  • an initial guess for the parameter and state

  • an error metric

  • a smoothness metric

We already have the physics model and some initial guesses. The next step is to write a pair of Python functions that will create the model-data misfit functional and the regularization functional. We’ll pass these functions to the inverse problem when we create it.

import icepack.inverse

def objective(u):
    δu = u - u_obs
    return 0.5 * ((δu[0] / σx)**2 + (δu[1] / σy)**2) * dx

Θ = firedrake.Constant(1.)
L = firedrake.Constant(19.5e3)
def regularization(θ):
    return 0.5 * (L / Θ)**2 * inner(grad(θ), grad(θ)) * dx

Now we create the InverseProblem object. We’ve already mentioned several objects that the inverse problem needs – the model, the initial guess, some functionals, etc. Additionally, it needs to know the name of the observed field and the parameter (the state_name and parameter_name) arguments, since these values are passed to the forward solver as keyword arguments.

All the additional arguments to the forward model are passed as a dictionary model_args. In our case, these consist of the thickness field, the initial guess for the velocity, and the forward solver tolerance. Finally, to specify the inverse problem, we need to know where Dirichlet boundary conditions are to be applied, as this affects how one solves for the Lagrange multiplier field \(\lambda\).

problem = icepack.inverse.InverseProblem(
    model_args={'h': h, 'u0': u, 'tol': 1e-6},

Once we’ve created the problem, we then create a solver object that will iteratively search for a good value of the parameters. The inverse solver has lots of moving parts, all of which are wrapped in a class that inherits from icepack.inverse.InverseSolver. In our case, we’ll be using the Gauss-Newton method, which is implemented in GaussNewtonSolver. Using this class should save you from worrying about too many low-level details, but still provide a good amount of flexibility and transparency.

As a convenience, the inverse solver can take in a function that it will call at the end of every iteration. For this demonstration, we’ll have it print out the values of the misfit and regularization functionals. You could also, say, make a plot of the state and parameter guess at every iteration to make a movie of how the algorithm progresses.

area = firedrake.assemble(firedrake.Constant(1) * dx(mesh))
def print_error_and_regularization(solver):
    E = firedrake.assemble(solver.objective)
    R = firedrake.assemble(solver.regularization)
    print('{:g}, {:g}'.format(E/area, R/area))

To create the solver object, we only need to give it a problem and optionally the function it will call on each iteration. (In programming parlance this is called a callback function.)

As a general principle, it’s a good idea to separate out the specification of a problem, in this case represented by the InverseProblem class, from the method used to solve the problem, represented by the GaussNewtonSolver class.

solver = icepack.inverse.GaussNewtonSolver(problem, print_error_and_regularization)
20.2803, 0

Before setting the solver loose, let’s look at the initial search direction.

fig, axes = subplots()
contours = icepack.plot.tricontourf(solver.search_direction, 20, alpha=0.6,
                                    cmap='RdBu', axes=axes)

The search direction is obtained by multiplying the inverse of the Gauss-Newton matrix \(H\) by the gradient \(dJ\) of the objective function. The Gauss-Newton matrix is dense, so we don’t actually build the matrix directly. Instead, the solver contains a procedure to multiply a vector by \(H\), which is all that’s necessary for using iterative methods to solve linear systems. Computing the search direction like this is time-consuming, but results in far fewer iterations, so it’s a net win.

The solve method takes in a relative convergence tolerance, an absolute tolerance, and a maximum number of iterations, and it returns the total number of iterations necessary to achieve the given tolerances. In our case, we’ll stop once the relative decrease in the objective function from one iteration to the next is less than 1/200.

The algorithm takes a while to run. Now would be the time to put on a fresh pot of coffee.

iterations = solver.solve(
2.124, 0.201318
1.08424, 0.400006
1.0537, 0.379332
1.03116, 0.394948

The algorithm converges in just a few iterations because of how good a search direction we get from using the Gauss-Newton approximation. Other methods like gradient descent take many more iterations to reach the same agreement with the data.


Now that we’re done, we’ll want to do some post-processing and analysis on the fluidity parameter that we inferred. The inverse problem object stores the parameter we’re inferring and the observed field as the properties parameter and state respectively. The names are intentionally not specific to just ice shelves. For other problems, we might instead be inferring a friction coefficient rather than a fluidity, or we might be observing the thickness instead of the velocity. You can see all the publicly visible properties by typing help(inverse_problem).

First, let’s plot the parameter value and the fluidity.

θ = solver.parameter
fig, axes = subplots()
levels = np.linspace(-3, 3, 61)
contours = icepack.plot.tricontourf(
    θ, levels=levels, alpha=0.6, cmap='RdBu_r', extend='both', axes=axes
A = firedrake.interpolate(A0 * firedrake.exp(θ), Q)
fig, axes = subplots()
levels = np.linspace(0, 32, 9)
contours = icepack.plot.tricontourf(
    A, levels=levels, alpha=0.6, extend='max', axes=axes

The fluidity is much higher around areas of heavy crevassing, such as the rift opening from the Gipps Ice Rise and the area flowing into it. Additionally, several areas downstream of the inlets have much higher fluidity, and these might indicate the formation of marine ice.

The fluidity is substantially lower around the upper left edge of the ice shelf. Some of the ice might actually be grounded here, in which case the resulting basal drag would tend to reduce the extensional strain of the glacier. However, since the only tunable variable for explaining the observed velocities is the fluidity, the algorithm will erroneously converge on whatever value of the fluidity can reproduce the observed values. In this case, the result is a very low value of \(A\), but for other problems the bias can go in the other direction.

To see how well the parameters fit the data, let’s look at the weighted difference between the computed and observed velocities.

fig, axes = subplots()
δu = firedrake.interpolate((solver.state - u_obs)**2/(2*σ**2), Q)
levels = np.linspace(0, 6, 13)
contours = icepack.plot.tricontourf(
    δu, levels=levels, alpha=0.6, cmap='Reds', extend='max', axes=axes

The computed ice velocity is mostly similar to observations aside from a few blips. The most widespread departures from observations occur around the rifts that flow into the Gipps Ice Rise. We regularized the problem by looking only for smooth values of the fluidity parameter. As a consequence, we won’t be able to see sharp changes that might result from features like crevasses or rifts. We might instead try to use the total variation functional

\[R(\theta) = L\int_\Omega|\nabla\theta|dx\]

if we were interested in features like this.

Finally, let’s try and see how much the inferred parameter departed from our naive initial guess.

print(icepack.norm(solver.parameter) / np.sqrt(area))
print(firedrake.assemble(solver.objective) / area)

The objective function has been reduced by a factor of 10 through the optimization procedure, and our final approximation departs quite substantially from the initial guess. This suggests that data assimilation does give a substantial benefit over an ad-hoc approach like picking a sensible constant value.


In this demo, we’ve shown how to back out the fluidity of an ice shelf from remote sensing observations. We could then use this value, together with a description of how the fluidity evolves, to initialize a prognostic model of the ice shelf. For example, we might assume that the fluidity is a function of ice temperature and damage. The evolution equations for these fields are fairly simple hyperbolic PDE for which we can write solvers using firedrake.

The value of the fluidity that we obtained is not at all spatially homogeneous. Unless we were very clever, we probably couldn’t have come up with some way to parameterize it to get a reasonable guess.

We would expect from statistical estimation theory that the value of the misfit functional divided by the shelf area will be around 1. (A sum of squares of normal random variables has a \(\chi^2\) distribution, which has mean 1, there are two components of the velocity vector, and we divide by 2 at the end.) The misfit we obtained once the algorithm has converged is very close to 1, which is a sign that the regularization parameter has been dialed in just right. However, we don’t always get so lucky. Sometimes you can’t fit the data as well as you expect from statistics, and there are a number of possible “failure modes”:

  1. The error estimates \(\sigma\) could be wrong.

  2. We don’t have a good way to also account for thickness errors, which are substantial.

  3. We regularized the problem too much.

  4. The ice shelf becomes grounded on some pinning point and we didn’t add basal drag.

  5. I implemented the numerical optimization algorithm incorrectly.

Failure modes 1 and 2 happen because we don’t have the right statistical distribution for the errors, while failure mode 3 happens because we don’t have the right prior distribution. Mode 4 is a more insidious type of failure. In this case, the physics model doesn’t actually describe the true behavior of the system, and as a consequence is unable to reproduce the observations with any value of the inputs. Some wrong physics models may nonetheless be able to reproduce the observations, provided that they are controllable as a function of the input parameters. Diagnosing this type of failure is arguably the most difficult. Last but not least is human error in implementing the optimization algorithms. These kinds of failures should be caught through testing on synthetic problems.