## Assimilating sparse data¶

In this demo, we'll look again at estimating the fluidity coefficient $A$ in Glen's flow law

$$\dot\varepsilon = A\tau^3$$

from observational data for the Larsen Ice Shelf. The previous tutorial showed how to solve statistical estimation problems when we can assume that the measured data $u^o$ are a continuous field defined throughout our entire domain. In that setting the model-data misfit functional is

$$E(u) = \int_\Omega\frac{|u - u^o|^2}{2\sigma^2}dx.$$

In this demo, we'll instead focus on sparse measurements -- the observations are defined at a set of isolated points $\{x_k\}$, and the model-data misfit functional is

$$E(u) = \sum_k\frac{|u(x_k) - u^o_k|^2}{2\sigma_k^2}.$$

Assembling and differentiating functionals that include pointwise evaluation, alongside the usual continuous operations like taking a gradient or integrating over the whole spatial domain, is relatively uncommon in finite element modeling packages. Firedrake is the only such package that natively supports defining functionals that involve evaluating a state variable at a discrete point set. Moreover, the ability to evaluate at a discrete point set composes nicely with the automatic differentiation features, so we can use it in optimization problems just like before.

We'll again use the reparameterization trick of inferring the field $\theta$ in

$$A = A_0e^\theta$$

in order to guarantee that the fluidity coefficient is positive. We'll also use the same regularization functional as before:

$$R(\theta) = \frac{L^2}{2\Theta^2}\int_\Omega|\nabla \theta|^2dx.$$

In all other respects, the problem will be identical to the previous tutorial.

#### Input data¶

The input data are the same as from the previous demo on inferring the fluidity of the Larsen Ice Shelf.

import subprocess
import geojson
import firedrake
import icepack

outline_filename = icepack.datasets.fetch_outline("larsen-2015")
with open(outline_filename, "r") as outline_file:

geometry = icepack.meshing.collection_to_geo(outline)
with open("larsen.geo", "w") as geo_file:
geo_file.write(geometry.get_code())

command = "gmsh -2 -format msh2 -v 2 -o larsen.msh larsen.geo"
subprocess.run(command.split())
mesh = firedrake.Mesh("larsen.msh")

The code below is the same boilerplate as in the previous tutorial for plotting simulation results on top of the mosaic of Antarctica image.

import numpy as np
import rasterio
import icepack.plot

coords = np.array(list(geojson.utils.coords(outline)))
δ = 5e3
xmin, xmax = coords[:, 0].min() - δ, coords[:, 0].max() + δ
ymin, ymax = coords[:, 1].min() - δ, coords[:, 1].max() + δ

image_filename = icepack.datasets.fetch_mosaic_of_antarctica()
with rasterio.open(image_filename, "r") as image_file:
image_window = rasterio.windows.from_bounds(
left=xmin,
bottom=ymin,
right=xmax,
top=ymax,
transform=image_file.transform,
)

def subplots(*args, **kwargs):
fig, axes = icepack.plot.subplots(*args, **kwargs)
xmin, ymin, xmax, ymax = rasterio.windows.bounds(
image_window, image_file.transform
)
kw = {
"extent": (xmin, xmax, ymin, ymax),
"cmap": "Greys_r",
"vmin": 12e3,
"vmax": 16.38e3,
}
try:
axes.imshow(image, **kw)
except AttributeError:
for ax in axes:
ax.imshow(image, **kw)

return fig, axes
fig, axes = subplots()
axes.set_xlabel("meters")
kwargs = {
"interior_kw": {"linewidth": 0.25},
"boundary_kw": {"linewidth": 2},
}
icepack.plot.triplot(mesh, axes=axes, **kwargs)
axes.legend();

Just like in the previous demos, we'll apply a smoothing filter to the thickness, which is necessary to get a reasonable driving stress.

import xarray
from firedrake import assemble, Constant, inner, grad, dx

thickness_filename = icepack.datasets.fetch_bedmachine_antarctica()
thickness = xarray.open_dataset(thickness_filename)["thickness"]

Q = firedrake.FunctionSpace(mesh, family="CG", degree=2)
h0 = icepack.interpolate(thickness, Q)

h = h0.copy(deepcopy=True)
α = Constant(2e3)
J = 0.5 * ((h - h0) ** 2 + α ** 2 * inner(grad(h), grad(h))) * dx
F = firedrake.derivative(J, h)
firedrake.solve(F == 0, h)

#### Point data¶

Before, we were able to just interpolate the gridded data directly from xarray datasets to the finite element spaces we were using for the velocity. We're going to have to do a bit more work now that we want to work with the point data directly. Rather than use xarray, we'll read from the NetCDF file using rasterio since it makes certain operations a little easier.

velocity_filename = icepack.datasets.fetch_measures_antarctica()
vx_file = rasterio.open(f"netcdf:{velocity_filename}:VX", "r")
vy_file = rasterio.open(f"netcdf:{velocity_filename}:VY", "r")
stdx_file = rasterio.open(f"netcdf:{velocity_filename}:ERRX", "r")
stdy_file = rasterio.open(f"netcdf:{velocity_filename}:ERRY", "r")

For starters, we'll want to create window objects around the observational data just like we did for reading the mosaic of Antarctica image. The windowed transform of the original data will help us to find which points in the raw data are contained inside the mesh.

window = rasterio.windows.from_bounds(
left=xmin,
bottom=ymin,
right=xmax,
top=ymax,
transform=vx_file.transform,
).round_lengths().round_offsets()
transform = vx_file.window_transform(window)

Just so you can get an idea of how dense the data are, the plot below zooms in on part of the shelf near the Gipps Ice Rise. The data points are shown in orange and the mesh triangles in black; in this region, there are about 10-15 data points in each triangle. In other regions, the mesh is much finer or coarser and so the data density could be different.

xs = np.array(
[
transform * (i, j)
for i in range(window.width)
for j in range(window.height)
]
)

fig, axes = subplots()
firedrake.triplot(mesh, axes=axes, interior_kw={"linewidth": 0.25})
axes.set_xlim((-2.07e6, -2.06e6))
axes.set_ylim((1.135e6, 1.145e6))
axes.scatter(xs[:, 0], xs[:, 1], 4.0, "tab:orange");

Next we'll use the locate_cell method of the mesh to determine which points of the gridded data set are inside the computational domain.

indices = np.array(
[
(i, j)
for i in range(window.width)
for j in range(window.height)
if mesh.locate_cell(transform * (i, j))
]
)
xs = np.array([transform * idx for idx in indices])

The object that represents this point set from the Firedrake side is called VertexOnlyMesh. To create a vertex-only mesh, we need to supply the background domain and the coordinates of the point cloud. This step can take some time because there are a lot of points to go through; under the hood, Firedrake is looking through the background mesh to find which triangle each point lives in.

We're also passing an extra argument to make sure that creating the vertex-only mesh will crash in the event that one of the input points is outside of the domain. This extra flag isn't necessary in our case because we've already made sure that all the points are inside the domain, but it's helpful to know about this flag for debugging later if, say, the sizes of the arrays don't match up.

point_set = firedrake.VertexOnlyMesh(
mesh, xs, missing_points_behaviour="error"
)

We can create function spaces defined on this point set just like we make function spaces defined on triangular meshes. We're specifying the element family as "DG" for discontinuous Galerkin, but strictly speaking this doesn't really matter -- on a point cloud, continuous and discontinuous Galerkin representations are the same.

Δ = firedrake.FunctionSpace(point_set, "DG", 0)

u_o = firedrake.Function(Δ)
v_o = firedrake.Function(Δ)
σ_x = firedrake.Function(Δ)
σ_y = firedrake.Function(Δ)

To fill these functions, we'll extract the values at the right indices of the gridded data and stuff them into the raw data array.

u_o.dat.data[:] = vx[indices[:, 1], indices[:, 0]]
v_o.dat.data[:] = vy[indices[:, 1], indices[:, 0]]
σ_x.dat.data[:] = stdx[indices[:, 1], indices[:, 0]]
σ_y.dat.data[:] = stdy[indices[:, 1], indices[:, 0]]

Finally, we'll make an initial guess for the ice velocity by interpolating the gridded data to the triangular mesh.

V = firedrake.VectorFunctionSpace(mesh, "CG", 2)
u_initial = icepack.interpolate((vx_file, vy_file), V)

We need to make an initial guess for the fluidity parameter. In this case, we'll use the same value as in the second demo -- a constant fluidity assuming a temperature of -13C.

T = Constant(260)
A0 = icepack.rate_factor(T)

def viscosity(**kwargs):
u = kwargs["velocity"]
h = kwargs["thickness"]
θ = kwargs["log_fluidity"]

A = A0 * firedrake.exp(θ)
return icepack.models.viscosity.viscosity_depth_averaged(
velocity=u, thickness=h, fluidity=A
)

model = icepack.models.IceShelf(viscosity=viscosity)
opts = {
"dirichlet_ids": [2, 4, 5, 6, 7, 8, 9],
"diagnostic_solver_type": "petsc",
"diagnostic_solver_parameters": {
"snes_type": "newtontr",
"ksp_type": "gmres",
"pc_type": "lu",
"pc_factor_mat_solver_type": "mumps",
},
}
solver = icepack.solvers.FlowSolver(model, **opts)

θ = firedrake.Function(Q)
u = solver.diagnostic_solve(
velocity=u_initial,
thickness=h,
log_fluidity=θ,
)

Let's look at the computed of the ice velocity starting from our assumption that the fluidity is constant in space.

fig, axes = subplots()
kwargs = {"precision": 1000, "density": 2500, "vmin": 0, "vmax": 750}
icepack.plot.streamplot(u, axes=axes, **kwargs);

Now let's start with a taste of what comes next: interpolating this computed velocity to the vertex only mesh and evaluating the model-data misfit. The real work is happening here in the two calls to interpolate right at the beginning of this cell. The interpolate method can do more than just interpolate an algebraic expression into a finite element space -- it can also interpolate a Function defined on one mesh to a Function defined on a point cloud.

u_interp = firedrake.interpolate(u[0], Δ)
v_interp = firedrake.interpolate(u[1], Δ)

δu, δv = u_interp - u_o, v_interp - v_o

square_error = (δu / σ_x)**2 + (δv / σ_y)**2

N = len(indices)
initial_misfit = assemble(0.5 * square_error * dx) / N
print(initial_misfit)
2019.423034201167

To sum up expressions over the point cloud, we use the assemble function and the measure dx just like we do on ordinary triangular meshes!

#### Inferring the fluidity¶

There are four parts that go into an inverse problem:

• a physics model
• an initial guess for the parameter and state
• an error metric
• a smoothness metric

We already have the physics model and some initial guesses. The physics are wrapped up in the Python function simulation defined below; we'll pass this function when we create the inverse problem.

def simulation(θ):
return solver.diagnostic_solve(
velocity=u_initial, thickness=h, log_fluidity=θ
)

Since the ice shelf is so large, we're going to want to scale some of our output quantities of interest by the area of the shelf. This will make everything into nice dimensionless numbers, rather than on the order of $10{}^{10}$.

area = Constant(assemble(Constant(1.0) * dx(mesh)))

The next step is to write a pair of Python functions that will create the model-data misfit functional and the regularization functional. In the previous demo on inverse problems, we scaled both of these quantities by the area of the domain in order to get nice dimensionless quantities. Here the loss functional isn't an area integral -- it's a sum over discrete points. Instead, we'll divide by the number of observations.

def loss_functional(u):
u_interp = firedrake.interpolate(u[0], Δ)
v_interp = firedrake.interpolate(u[1], Δ)
δu, δv = u_interp - u_o, v_interp - v_o
return 0.5 / Constant(N) * ((δu / σ_x)**2 + (δv / σ_y)**2) * dx

def regularization(θ):
Θ = Constant(1.)
L = Constant(7.5e3)
return 0.5 / area * (L / Θ)**2 * inner(grad(θ), grad(θ)) * dx

Now we'll create create a StatisticsProblem object. To specify the problem, we need to give it a procedure for running the simulation, another procedure for evaluating how good our guess is, and an initial guess for the unknown parameters.

from icepack.statistics import StatisticsProblem, MaximumProbabilityEstimator

problem = StatisticsProblem(
simulation=simulation,
loss_functional=loss_functional,
regularization=regularization,
controls=θ,
)

Now that we've specified the problem, we'll create a MaximumProbabilityEstimator to look for a solution. The runtime is about the same as in the previous demo, so feel free to put on a fresh pot of coffee.

estimator = MaximumProbabilityEstimator(
problem,
step_tolerance=1e-1,
max_iterations=50,
)
θ = estimator.solve()
Truncated CG Trust-Region Solver
iter  value          gnorm          snorm          delta          #fval     #grad     tr_flag   iterCG    flagCG
0     2.019423e+03   3.326136e-02                  6.769483e+04
1     6.884050e+02   2.081603e-02   6.769483e+04   1.692371e+05   3         2         0         2         3
2     6.884050e+02   2.081603e-02   1.692371e+05   1.057732e+04   4         2         2         3         2
3     4.984923e+02   1.514437e-02   1.057732e+04   2.644329e+04   5         3         0         1         3
4     3.077243e+02   8.722083e-03   2.644329e+04   2.644329e+04   6         4         0         1         3
5     1.926535e+02   3.189587e-03   2.644329e+04   6.610823e+04   7         5         0         2         3
6     1.310413e+02   6.092062e-03   6.610823e+04   6.610823e+04   8         6         0         4         3
7     7.390363e+01   2.679055e-03   6.610823e+04   6.610823e+04   9         7         0         5         2
8     6.338102e+01   4.154393e-03   6.610823e+04   6.610823e+04   10        8         0         5         2
9     6.338102e+01   4.154393e-03   6.610823e+04   4.131765e+03   11        8         2         5         2
10    5.436826e+01   1.651255e-03   4.131765e+03   1.032941e+04   12        9         0         1         3
11    4.637153e+01   1.108483e-03   1.032941e+04   2.582353e+04   13        10        0         3         3
12    3.927459e+01   8.078531e-04   2.582353e+04   2.582353e+04   14        11        0         5         3
13    3.401306e+01   8.158500e-04   2.582353e+04   6.455882e+04   15        12        0         6         3
14    3.401306e+01   8.158500e-04   6.455882e+04   4.034926e+03   16        12        2         8         3
15    3.266645e+01   2.846200e-04   4.034926e+03   1.008732e+04   17        13        0         3         3
16    3.150221e+01   4.719721e-04   1.008732e+04   2.521829e+04   18        14        0         4         3
17    2.875062e+01   6.064766e-04   2.521829e+04   2.521829e+04   19        15        0         6         3
18    2.636051e+01   6.742751e-04   2.521829e+04   2.521829e+04   20        16        0         7         3
19    2.535732e+01   8.413788e-04   2.521829e+04   2.521829e+04   21        17        0         9         3
20    2.535732e+01   8.413788e-04   2.521829e+04   1.576143e+03   22        17        2         2         3
21    2.477912e+01   7.789656e-04   1.576143e+03   1.576143e+03   23        18        0         2         3
22    2.422804e+01   2.763071e-04   1.576143e+03   3.940358e+03   24        19        0         2         3
23    2.378412e+01   2.366809e-04   3.940358e+03   3.940358e+03   25        20        0         3         3
24    2.346632e+01   1.825964e-04   3.940358e+03   9.850894e+03   26        21        0         4         3
25    2.304476e+01   5.709335e-04   9.850894e+03   9.850894e+03   27        22        0         6         3
26    2.304476e+01   5.709335e-04   9.850894e+03   6.156809e+02   28        22        2         7         3
27    2.288173e+01   2.599514e-04   6.156809e+02   1.539202e+03   29        23        0         2         3
28    2.274139e+01   1.061313e-04   1.539202e+03   3.848006e+03   30        24        0         3         3
29    2.259491e+01   2.144351e-04   3.848006e+03   9.620014e+03   31        25        0         3         3
30    2.224138e+01   4.463982e-04   9.620014e+03   9.620014e+03   32        26        0         5         3
31    2.224138e+01   4.463982e-04   9.620014e+03   6.012509e+02   33        26        2         5         3
32    2.211099e+01   3.085763e-04   6.012509e+02   1.503127e+03   34        27        0         1         3
33    2.195955e+01   8.618475e-05   1.503127e+03   3.757818e+03   35        28        0         3         3
Optimization Terminated with Status: Converged

As before, the algorithm reduces the objective by two orders of magnitude by the time it's converged. The computed log-fluidity field looks very similar to the one obtained when we matched the computed velocity to the field obtained by interpolating the observations to the mesh.

fig, axes = subplots()
colors = icepack.plot.tripcolor(θ, vmin=-5, vmax=+5, axes=axes)
fig.colorbar(colors);

#### Conclusion¶

The results we obtained from assimilating the point data directly were mostly similar to those we obtained from interpolating them to the finite element mesh. Why go through the exercise at all?

One of the advantages of assimilating sparse point data directly is that we can be more rigorous about the statistical interpretation of our results. If we assume that the observational data are normally distributed about the true velocity values with mean 0 and with standard deviation as reported in the remote sensing product, then the sum of squared errors is a $\chi^2$ random variable with $N - M$ degrees of freedom, where $N$ is the number of observation points and $M$ is the number of degrees of freedom of the parameters we're fitting.

u = simulation(θ)
χ_2 = assemble(loss_functional(u))
print(χ_2)
18.32354351996535

By using the point data directly, we know exactly what $N$ is, which isn't true at all when we're comparing to a velocity that's been interpolated to a mesh that could have a more or less arbitrary and variable resolution. We still have the problem of determining the number $M$ of degrees of freedom, which is a little more involved because of the regularization; we have to read it off from the trace of the posterior covariance matrix. That's outside the scope of this tutorial, but we could also have, for example, used a few eigenfunctions of the Laplace operator as a fixed set of basis functions with no regularization instead. In any case, using the raw data gets us closer to a statistical exactitude. If we can't fit our model to observations as well as we expect, that suggests either that we've mis-characterized the statistical properties of the observations or that our model is incorrect.

The second key advantage was not apparent in this notebook but does show up in other applications. Here we used a gridded data product with fairly dense coverage over the domain of interest. Other data sources are genuinely sparse, for example laser altimetry measurements from ICESat-2, radar sounding from Operation IceBridge flights, or even strain gauge or thermometry data from isolated boreholes.