We’re working on finalizing the 2nd Edition of Algorithms for Optimization. We originally set out to add three new chapters, but along the way we also overhauled most of the book. It’s pretty exciting!
These projects are so big, and so long, that you’ll inevitably run into all sorts of new challenges because time has passed, some dependencies are no longer supported, you try to do things, or the MIT Press requirements have changed. That last one is the inspiration for this blog post.
I ended up writing some new tooling to support alttext. Short for alternative text for images, alttext is textual content that can be associated with an image and read out to someone using a screen reader. It wasn’t part of the submission materials when we wrote Algorithms for Optimization v1 and Algorithms for Decision Making, but this time around, MIT Press asked that we supply alttext for every figure in a big spreadsheet. New challenge!
Mykel and I are somewhat different when it comes to being textbook authors. Most authors submit large Word documents with disparate images and let the MIT Press team handle the final text layout. Not us. We provide the final printable PDF.
Our setup is quite nice. It is all under source control, we have a ton of control over how everything looks, and we have everything for the book in one place.
When I saw the ask to supply this additional spreadsheet, I instantly became worried that having a separate sheet could cause problems. That sheet needs to be kept in sync with the textbook: if any figures are added or removed, we want to make sure they are also added or removed from the sheet. The sheet is also a somewhat inconvenient place to write the alttext. Ideally it would be defined in the LaTeX documents, alongside the figure that it describes. Most importantly, we need to know if we’re missing any alttext.
Storing Tests by Code
We already have some nice technology in our textbook-writing workflow that lets us use the algorithms that we present to the reader both to generate figures and to author and execute unit tests.
We present our algorithms using PythonTeX and algorithm environments:
\begin{algorithm}
\begin{juliaverbatim}
diff_forward(f, x; h=1e-9) = (f(x+h) - f(x))/h
diff_central(f, x; h=1e-9) = (f(x+h/2) - f(x-h/2))/h
diff_backward(f, x; h=1e-9) = (f(x) - f(x-h))/h
\end{juliaverbatim}
\caption{...}
\end{algorithm}
The juliaverbatim blocks get typeset, but they aren’t executed.
We have a script that parses our source files for algorithm blocks and exports the juliaverbatim contents into a big Julia source file belonging to an Alg4Opt.jl Julia package. We can then load this package when executing PythonTeX blocks that do execute, for generating our figures.
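As a rough illustration of that export step (not the actual script, which is not shown here), the following sketch collects the body of every juliaverbatim block from a list of source lines. The helper name extract_blocks and the sample input are assumptions made for this example.

```julia
# Hypothetical helper: collect the body of every \begin{env}...\end{env} block.
# Assumes the \begin and \end commands each sit on their own line.
function extract_blocks(lines::Vector{String}, env::String)
    blocks = String[]
    current = String[]
    inside = false
    for line in lines
        stripped = strip(line)
        if startswith(stripped, "\\begin{$env}")
            inside = true
            empty!(current)
        elseif startswith(stripped, "\\end{$env}")
            inside = false
            push!(blocks, join(current, "\n"))
        elseif inside
            push!(current, line)
        end
    end
    return blocks
end

# Made-up input resembling an algorithm block from the book source.
tex = String[
    "\\begin{algorithm}",
    "\\begin{juliaverbatim}",
    "diff_forward(f, x; h=1e-9) = (f(x+h) - f(x))/h",
    "\\end{juliaverbatim}",
    "\\caption{...}",
    "\\end{algorithm}",
]

# Concatenate all extracted bodies into one source string.
code = join(extract_blocks(tex, "juliaverbatim"), "\n\n")
println(code)
```

The resulting string could then be written into the package source file. Where that file lives and how the package wraps it are details this sketch leaves out.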
We have had unit testing since the beginning. When we first wrote Algorithms for Optimization, we had the unit tests in a separate directory, written in the test files for the Alg4Opt.jl Julia package we exported to. That worked, but the tests were written in an entirely different place than the methods. Sound like storing alttext somewhere other than the figures?
We ended up defining a no-op LaTeX environment:
\excludecomment{juliatest}
and then added juliatest blocks after every algorithm block:
\begin{juliatest}
let
    for (f,x,∂) in [(x->x, 0.0, 1.0),
                    (x->x, 1.0, 1.0),
                    (x->x, 1.0, 1.0),
                    (x->x^2, 0.0, 0.0),
                    (x->x^2, 1.0, 2.0),
                    (x->x^2, -1.0,-2.0)]
        @test isapprox(diff_forward(f, x), ∂, atol=1e-6)
        @test isapprox(diff_central(f, x), ∂, atol=1e-6)
        @test isapprox(diff_backward(f, x), ∂, atol=1e-6)
    end
end
\end{juliatest}
We then parse the LaTeX source files in the same way we do for the algorithm blocks, and export the contents of any juliatest block as unit tests.
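One way the exported test bodies might be assembled into a runnable test file is sketched below. The helper name build_test_file, the testset naming, and the assumption that the Alg4Opt package makes the tested functions available are all illustrative; the real export may look quite different.

```julia
# Hypothetical helper: wrap each exported juliatest body in its own @testset
# and return the source of a single test file as a string.
function build_test_file(snippets::Vector{String}; package::String="Alg4Opt")
    io = IOBuffer()
    println(io, "using Test")
    println(io, "using ", package)   # assumes the package provides the tested functions
    println(io)
    for (i, snippet) in enumerate(snippets)
        println(io, "@testset \"juliatest block ", i, "\" begin")
        println(io, snippet)
        println(io, "end")
        println(io)
    end
    return String(take!(io))
end

# Made-up snippet standing in for the contents of one juliatest block.
snippets = ["@test isapprox(diff_forward(x -> x^2, 1.0), 2.0, atol=1e-6)"]
print(build_test_file(snippets))
```

Giving each block its own testset means a failure can be traced back to a specific juliatest block in the LaTeX source.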
Storing the tests next to the algorithms makes things a lot nicer.
Pulling Alttext
I decided that I could do something very similar for alttext.
I defined a dummy command that, like juliatest, does nothing when compiling the book, but lets us put the alttext content into it:
\newcommand{\alttext}[1]{}
We can then use it in the source code to define the alttext alongside the figure:
\caption{
A one-dimensional optimization problem.
Note that the minimum is merely the best in the feasible set---lower points may exist outside the feasible region.
\label{fig:one-d-opt-prob}
\alttext{A line chart with a single undulating curve and an interval
containing a local minimum identified as the feasible set.}
}
I then wrote a script that runs through our source files, finds all figure and marginfigure blocks, and searches each one for such a command. If it finds one, great: we can pull out the alttext content and export it to that spreadsheet we need. If not, we can print out a warning that the figure (whose \label ID we also extract) is missing alttext. A nice, simple scripted solution.
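One wrinkle with this kind of extraction is that an \alttext argument can wrap across several source lines, as in the caption above, and could in principle contain braces of its own. A minimal sketch of brace-balanced extraction follows. The function name extract_alttext is hypothetical, and this is an alternative illustration rather than the approach the script takes.

```julia
# Hypothetical helper: return the argument of the first \alttext{...} in `text`,
# honoring nested braces, or "" if no complete command is found.
# Runs of whitespace (including line breaks) are collapsed to single spaces.
function extract_alttext(text::String)::String
    r = findfirst("\\alttext{", text)
    r === nothing && return ""
    start = nextind(text, last(r))   # first index after the opening brace
    depth = 1
    i = start
    while i <= lastindex(text)
        c = text[i]
        if c == '{'
            depth += 1
        elseif c == '}'
            depth -= 1
            if depth == 0
                body = text[start:prevind(text, i)]
                return join(split(body), " ")
            end
        end
        i = nextind(text, i)
    end
    return ""   # braces never balanced
end

caption = """
\\caption{
A one-dimensional optimization problem.
\\label{fig:one-d-opt-prob}
\\alttext{A line chart with a single undulating curve and an interval
containing a local minimum identified as the feasible set.}
}
"""
println(extract_alttext(caption))
```

The sketch counts every brace, so it assumes any escaped braces inside the alttext come in matched pairs.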
The full script is below.
using Printf
mutable struct FigureEntry
    file_index::Int    # Index into chapter files
    line_index_lo::Int # Index into the chapter's lines at which the \begin resides
    line_index_hi::Int # Index into the chapter's lines at which the \end resides
    label::String      # Figure label, as defined by \label command (or empty)
                       # Figures may not have labels if they are in solutions or examples.
    alttext::String    # Alt text, as given by an \alttext command (or empty)
                       # Every figure is expected to have alttext for the final deliverable.
end
function is_start_of_block(str, block)
    return startswith(str, "\\begin{$block}")
end
function is_end_of_block(str, block)
    return startswith(str, "\\end{$block}")
end
function get_files(; chapter_regex::Regex = r"include\{chapter")
    retval = String[]
    for line in readlines("optimization-chapter.tex")
        if occursin(chapter_regex, line)
            m = match(r"chapter/\S*(?=\})", line)
            @assert isa(m, RegexMatch)
            push!(retval, m.match*".tex")
        end
    end
    return retval
end
function find_matching_paren(str::String, starting_index::Int=something(findfirst(isequal('('), str), 0))
    @assert str[starting_index] == '('
    nopen = 1
    i = starting_index
    n = lastindex(str)
    while nopen > 0 && i < n
        i = nextind(str,i)
        nopen += str[i] == '('
        nopen -= str[i] == ')'
    end
    return nopen == 0 ? i : -1
end
"""
Find the text for a label, such as "fig:gradient_descent_rosenbrock" from
\\label{fig:gradient_descent_rosenbrock}
There should only ever be one \\label entry. In the event that there are multiple,
this method returns the first one.
If no label is found, this method returns an empty string.
"""
function find_label(lines, line_index_lo::Int, line_index_hi::Int)::String
    for line in lines[line_index_lo:line_index_hi]
        m = match(r"\\label\{([a-zA-Z0-9_:\\-]+)\}", line)
        if isa(m, RegexMatch)
            return m[1]
        end
    end
    return ""
end
"""
Find the alttext for a figure, which is contained inside an \\alttext{} command.
There should only ever be one \\alttext entry per figure. In the event that there are multiple,
this method returns the first one.
If no alttext is found, this method returns an empty string.
"""
function find_alttext(lines, line_index_lo::Int, line_index_hi::Int)::String
    # Join the block into one string so that an \alttext{...} argument that wraps
    # across several source lines (as in the caption example) is still matched.
    block = join(lines[line_index_lo:line_index_hi], " ")
    m = match(r"\\alttext\{([^}]+)\}", block)
    if isa(m, RegexMatch)
        return m[1]
    end
    return ""
end
function pull_figures()
    is_start_of_ignore = str -> is_start_of_block(str, "ignore")
    is_start_of_figure = str -> is_start_of_block(str, "figure")
    is_start_of_marginfigure = str -> is_start_of_block(str, "marginfigure")
    is_start_of_relevant_block = str -> is_start_of_figure(str) || is_start_of_marginfigure(str) || is_start_of_ignore(str)
    figures = FigureEntry[]
    for (file_index, filepath) in enumerate(get_files())
        filename = splitext(splitdir(filepath)[2])[1]
        println("\treading ", filename)
        lines = [replace(line, "\n"=>"") for line in open(readlines, filepath, "r")]
        counter = 0
        i = something(findfirst(is_start_of_relevant_block, lines), 0)
        while i != 0
            block = is_start_of_ignore(lines[i]) ? "ignore" :
                    is_start_of_figure(lines[i]) ? "figure" :
                                                   "marginfigure"
            j = findnext(str -> is_end_of_block(str, block), lines, i+1)
            if block != "ignore"
                label = find_label(lines, i, j)
                alttext = find_alttext(lines, i, j)
                push!(figures, FigureEntry(file_index, i, j, label, alttext))
            end
            i = something(findnext(is_start_of_relevant_block, lines, j), 0)
        end
    end
    return figures
end
# Find all figure and marginfigure blocks
println("Pulling all figures")
figures = pull_figures()
for (i_figure, figure) in enumerate(figures)
    @printf "%3d %2d [%04d:%04d] %s\n" i_figure figure.file_index figure.line_index_lo figure.line_index_hi figure.label
    println(" $(figure.alttext)")
end
n_figures_missing_alttext = count(fig -> isempty(fig.alttext), figures)
if n_figures_missing_alttext > 0
    println("MISSING ALT TEXT!")
    files = get_files()
    for (i_figure, figure) in enumerate(figures)
        # Only report the figures that actually lack alttext.
        if isempty(figure.alttext)
            label_text = isempty(figure.label) ? "UNLABELED" : figure.label
            @printf "%2d %s in %s\n" i_figure label_text files[figure.file_index]
        end
    end
end
println("")
println("$(length(figures) - n_figures_missing_alttext) / $(length(figures)) figures have alttext")
if n_figures_missing_alttext == 0
    println("Good job!")
end
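The script above prints its findings; the spreadsheet export itself is not shown. A minimal sketch of how label and alttext pairs might be written out as CSV follows. The column names, the helper names csv_field and write_alttext_csv, and the sample rows are assumptions for illustration, not the format MIT Press requested.

```julia
# Hypothetical helper: quote one CSV field, doubling any embedded double quotes.
csv_field(s::AbstractString) = "\"" * replace(s, "\"" => "\"\"") * "\""

# Hypothetical export: a header row, then one row per (label, alttext) pair.
function write_alttext_csv(io::IO, rows)
    println(io, "label,alttext")
    for (label, alttext) in rows
        println(io, csv_field(label), ",", csv_field(alttext))
    end
end

# Made-up rows for illustration; the second one is an unlabeled figure.
rows = [("fig:one-d-opt-prob", "A line chart with a single undulating curve."),
        ("", "A contour plot titled \"Example\".")]

io = IOBuffer()
write_alttext_csv(io, rows)
print(String(take!(io)))
```

Quoting every field keeps commas and quotation marks inside the alttext from breaking the row structure. Writing to a file instead of a buffer only requires passing an opened file handle as io.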
The Joy of Coding Your Own Tools
That’s what this blog post is really about: the fact that you can dig in and code your own solution. We spend so much time coding for big company projects that it is easy to forget that we can code small, useful things for ourselves.
The coding we did here is not particularly clever, nor particularly difficult, nor particularly large. That isn’t the point. The point is that we had a problem, and we were able to solve it ourselves with software. Our tools of the trade were brought to bear on our own problem.
I don’t often use coding to solve my own problems, but it does happen every so often. I used coding to create placecards for my wedding, for example, and to create the wedding website. I’ve written code to generate .svg files for CNC laser cutters, in order to craft a loved one a nice birthday present. In high school, I wrote a basic notecard program for practicing my French vocab. That one was super useful.
I am a big fan of Casey Muratori of Handmade Hero (and Computer, Enhance!), which gave rise to the handmade movement. The ideas there are very similar — there is joy to be had from building things yourself, and you are smart enough to dive into something and learn how it works.
Anyhow, I think it’s nice to be reminded of all this from time to time. Happy coding!