Geo-Referenced PDFs

TL;DR

Sometime between September 2021 and January 2022, Avenza Maps changed the processing of side-loaded maps in a way that broke my workflow. The open source tool qpdf fixes whatever is wrong.

Backstory

Between 2010 and 2014 I created a new trail map for a backcountry ski patrol. To make the map, I looked around, discovered the OpenStreetMap project and the tool ecosystem that surrounds it, and decided I could base the new trail map on that project. I contributed the public ski trail and route information to the OpenStreetMap project so it is available to other ski-related map projects like Open Snow Map. I also set up a small local database for some non-public information, then created a set of scripts and programs to generate printable trail maps.

The result was a set of maps of various sizes and content based on the target audience (public trail map, member’s version map with UTM grid and information useful for search and rescue (SAR) operations, etc.). Each “product” was a PDF which, if printed at 100%, was scaled correctly for its intended use.

When going over some items with the US Forest Service District Ranger I was asked if that same map could be used in Avenza Maps. I said yes without knowing anything about Avenza Maps.

Then I discovered that for Avenza Maps the PDF needs to be geo-referenced; that is, additional metadata must be added to the PDF to describe the mapping from the PDF page image to real-world coordinates. So I had to figure out how to geo-reference the PDFs I was creating.

There are two schemes to geo-reference a PDF: the original “GeoPDF®”, which I could find specifications for, and Adobe’s later extensions using the “viewport” feature added in PDF version 1.6. Since I could find the specifications for GeoPDF and I had the source to the PDF generation library I was using (effectively MIT licensed), I implemented the GeoPDF extensions.
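As a rough illustration of how the two schemes differ inside the file, the sketch below scans the raw bytes of a PDF for the dictionary keys each scheme uses. It assumes that a GeoPDF carries its registration data under an /LGIDict key and that the Adobe scheme attaches a /Measure dictionary with /Subtype /GEO to a /VP viewport array. The function name is made up for this example, and a plain byte scan like this will miss keys that are stored inside compressed object streams.

```python
import re


def detect_georef_schemes(pdf_bytes: bytes) -> set:
    """Return the geo-referencing schemes whose marker keys appear in the raw bytes.

    Hypothetical helper: a byte scan only, not a real PDF parser.
    """
    schemes = set()

    # Assumption: GeoPDF registration data lives under an /LGIDict key.
    if re.search(rb"/LGIDict\b", pdf_bytes):
        schemes.add("geopdf")

    # Assumption: the Adobe scheme uses a /VP viewport array whose
    # /Measure dictionary is marked with /Subtype /GEO.
    has_viewport = re.search(rb"/VP\b", pdf_bytes) is not None
    has_geo_measure = re.search(rb"/Subtype\s*/GEO\b", pdf_bytes) is not None
    if has_viewport and has_geo_measure:
        schemes.add("viewport")

    return schemes
```

Calling it on the contents of a file, for example detect_georef_schemes(open("map.pdf", "rb").read()) with a placeholder file name, returns an empty set when neither marker is visible in the uncompressed part of the file.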

With an updated tool chain, my maps loaded properly in Avenza Maps and life was good.

Now that I had a way to generate maps for Avenza, I decided to create and publish my own hiking maps. There have been some learning experiences along the way, but in general things have gone fairly smoothly for many years.

Smoothly that is, until this January.

When generating new maps I “side load” them onto Avenza Maps on my iPhone and verify they load correctly, are properly geo-referenced, have no glaringly obvious rendering issues, etc. Then I upload them to the Avenza Maps store and delete the test copies from my phone. That worked in September of 2021. But in January 2022 all I got when I loaded a test map onto the iPhone was the message “1 map(s) failed to process”.

Not a very helpful error message.

No hint as to why it failed. No way to turn on additional logging or debugging.

With respect to the tool chain, nothing on my side had changed so this had to be because of a change on Avenza’s side.

Initial Debugging

I poked and prodded the Avenza Maps app feeding it various PDF files. And I found another iPhone app that uses geo-referenced PDFs which gave more robust error messages on things it did not like.

Eventually I noticed that PDFs that used Adobe’s viewport implementation of geo-referencing seemed to work while PDFs that used GeoPDF did not. But I did not have a copy of the specifications for the Adobe implementation, and I did not have the time to work on changing the PDF generation library, so my map update was put on the back burner.

Ultimately, it turns out that the method of geo-referencing a PDF was not the issue. But in January that was my erroneous impression.

April 2022

Revising my PDF generation library

Finally, this April I decided to dig into this problem. I still don’t have a copy of the Adobe geo-reference extension specifications (all links that look like they should point to the specs lead to a generic developer page on the Adobe website), but I had enough sample PDFs from various sources that I was able to reverse engineer things. In fairly short order I had my PDF generation library set up so I could specify either of the two methods of geo-referencing.

And the PDFs generated with the modified library loaded properly on the other iPhone app. But they still failed on Avenza Maps.

More Experimentation

At this point I had PDFs that were acceptable to the GDAL library and to the other iPhone app. And they could be opened for viewing in every laptop app I had that can display PDFs. But they still would not load into Avenza Maps.

It dawned on me that Adobe probably had tools to examine and fix issues in PDFs and my spouse had a copy of Acrobat Pro on her old machine. Loading both my older maps and maps created with my revised tool chain showed they all rendered with no errors and were properly geo-referenced when viewed on Acrobat Pro.

While playing around with this, I found that if I opened my map PDFs in Acrobat Pro and then did a “Save As” into an “optimized” PDF, they would load properly into Avenza Maps. I had discovered a workaround.

Note that the older GeoPDF-based and the newer viewport-based PDFs both worked properly if “optimized” by Acrobat Pro. The issue was not that Avenza had dropped support for GeoPDFs. The issue was that Avenza just did not like something somewhere in the PDF. But the single “1 map(s) failed to process” message was not giving any clues.

I spent some time re-reading the PDF 1.7 specifications and noticed a few possible issues which I corrected but these fixes made no difference: Everything but Avenza was happy with the maps and Avenza was only willing to give me an uninformative fail message.

I really did not relish having to add a step of copying maps to another computer, manually processing them in Adobe Acrobat Pro, then copying them back to my map generation computer. I wanted to find some way to generate usable PDFs without additional manual steps.

More Research

Time to find tools, preferably open source tools, that could tell me what was wrong with my PDFs. Eventually I found almost what I was looking for, a blog post titled “PDF processing and analysis with open-source tools”.

Amazingly, nearly every tool mentioned was already installed on my computer as part of the standard distribution. A couple of the tools listed found errors in my PDFs which I fixed but which made no difference: My files still failed to load in Avenza Maps but would load successfully on everything else I tried them on.

Success?

The qpdf tool mentioned in that blog post has an option called “linearize” which does something similar to the “Save As” into an “optimized” PDF in Adobe Acrobat Pro. And it apparently fixes the PDFs, perhaps in the same way, as the resulting file is acceptable to Avenza Maps.

It is easy to add one more call to my generation scripts to post-process my map PDFs with qpdf, so that is what I did. The qpdf program executes in a couple of seconds per map, it is integrated into my build system, and I don’t have to transfer files to and from another computer to “fix” them. Sanity is restored!
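For anyone wanting to do the same, a post-processing step along these lines might look like the sketch below. It is not my actual build script: the function names and file names are placeholders, and it only assumes that the qpdf executable is on the PATH. The one piece taken from qpdf itself is the --linearize option followed by the input and output file names.

```python
import subprocess
from pathlib import Path


def linearize_command(src, dst):
    """Build the qpdf command line that writes a linearized copy of src to dst."""
    return ["qpdf", "--linearize", str(src), str(dst)]


def linearize(src, dst):
    """Run qpdf on src and return the path of the linearized output.

    check=True raises subprocess.CalledProcessError on any non-zero exit
    status, so a failed run stops the build instead of passing silently.
    """
    subprocess.run(linearize_command(src, dst), check=True)
    return Path(dst)
```

A build script would call something like linearize("trail-map.pdf", "trail-map-avenza.pdf") as the last step for each map. Writing to a separate output file keeps the original PDF around for comparison.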

I Still Don’t Know What Is Wrong!

My big issue with this “solution” is that I do not have a clue about the underlying problem. What is it about the PDFs generated by my custom library that the Avenza Maps app does not like?

I know that if they have been linearized (optimized for web viewing) they will load correctly. And I know that prior to linearization the PDFs pass all the validation tests I could find for them.
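Since linearization is the one property I know makes a difference, a build script can at least confirm that a map has been through that step before it is side loaded. A linearized PDF starts with a linearization parameter dictionary containing a /Linearized key, and the PDF specification requires that dictionary to sit within the first 1024 bytes of the file. The sketch below checks only for that marker; the function name is made up, and finding the key shows that a file claims to be linearized, not that it is valid. For a real check, qpdf has a --check-linearization option.

```python
def looks_linearized(pdf_bytes: bytes) -> bool:
    """Return True if the file header region contains a /Linearized key.

    Hypothetical helper: it inspects only the first 1024 bytes, which is
    where the linearization parameter dictionary has to appear.
    """
    head = pdf_bytes[:1024]
    return b"%PDF-" in head and b"/Linearized" in head
```

This does not explain why Avenza Maps rejects the unprocessed files, but it is a cheap guard against uploading a map that skipped the qpdf step.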