When rendering a map using data from the OpenStreetMap (OSM) project the cookbook implementations end up using Mapnik XML to specify the presentation (you may get to the Mapnik XML by passing CartoCSS through a preprocessor). Seems simple enough, for each feature you want to map your XML looks something like:
<Style name="natural_point_names">
<Rule>
<Filter>
([natural] = 'peak') and
(not ([name] = null))
</Filter>
<TextSymbolizer
fontset-name="sans-fonts"
size="6 * 4.16666666667"
fill="black"
justify-alignment="center"
dy="-5 * 4.16666666667"
dx="0"
placement-type="list"
>
[name]
<Placement dy="-6 * 4.16666666667" dx="0" /> <!-- north -->
<Placement dy="0" dx="7 * 4.16666666667" /> <!-- east -->
<Placement dy="2 * 4.16666666667" dx="0" /> <!-- south -->
<Placement dy="0" dx="-7 * 4.16666666667" /> <!-- west -->
</TextSymbolizer>
</Rule>
.
.
.
</Style>
The “[name]” part says take the value of the name tag. The other stuff in the <TextSymbolizer> part specifies the styling. And all the stuff in <Filter> specifies what this whole <Rule> is applied too.
Again, in the cookbook case, OSM data is put into a Postgresql database for use. So to get the name out you need a select query. Something like:
SELECT way, "natural", "place", "tourism", "amenity",
name
FROM planet_osm_point WHERE ("natural" is not null) or ("place" is not null)) as foo
Easy peasy.
You may want to have the elevation shown with the name and an easy way to do that is to make it part of the name that is passed to the styling information. At least I found it easier to append the elevation, if it exists, to the name in the SQL query than to get everything lined up and spaced properly in the XML style. But there are places where the primary name is not the only one that is used. So we should check if there is an alt_name value. In SQL with a little work we can return a synthetic name value of the form:
Name (alt_name) elevation
We can even convert the elevation in meters to feet. Looking pretty good:
But what is this “Los Angeles County (Los Angeles)” bit? That doesn’t look good. The second “Los Angeles” is from the alt_name. Maybe we shouldn’t display the alt_name if it is a leading substring of the name. This looks better:
We will be trekking in Nepal so I’d like to make a offline/paper map of the area around where we will be. But the primary names there may not be in English. So we need to extend our logic to cover the “name:en” and “alt_name:en” tags. But it turns out that in many cases in Nepal the “name” and “name:en” fields have the same value so we need to check for and deal with that. We might be adequately handling the names now:
Oddly, at least to me, the mappers in Kathmandu have decided to put a “Romanized” version of their local names in the name field rather than in their local language. Looking at the trekking areas around Lukla and Namche Bazar, it seems this practice extends beyond the Kathmandu Valley. This makes creating a map with Nepalese and English names impossible without resorting to special casing the logic for Nepal. That is we can’t just look at the name and name:en tags, we need to figure out if the feature is in Nepal and if so look at the name:ne tag instead of or in addition to the name tag.
This does not scale well. Every country or region that uses Romanized (transliterated) names in the name tag will need to be special cased. OSM has no standardized way of specifying a default language code on a national or regional boundary so this will need to be in some sort of separate geographical database. Requiring a separate geographical database to render OSM, itself a geographical database, is not a good thing.
If you are defining mapping standards and practices for your country, please just follow the wiki for normal name tag practice and definitely don’t transliterate the name.
Unfortunately, combining the name, alt_name, name:en, alt_name:en, ref and ele fields into one synthetic “name” is a lot of logic to put into each and every SQL select statement. So put everything into a stored procedure:
-- Internal routine for consistent formatting of names of all types.
CREATE OR REPLACE FUNCTION display_name(
pname text DEFAULT NULL::text, -- value of name field
alt_name text DEFAULT NULL::text, -- value of alt_name field
en_name text DEFAULT NULL::text, -- value of name:en field
alt_en_name text DEFAULT NULL::text, -- value of alt_name:en field
elevation text DEFAULT NULL::text, -- value of ele field
reference text DEFAULT NULL::text) -- value of ref field
RETURNS text AS $$
BEGIN
-- Ignore alt_name in cases like the one where
-- name='Riverside County' and alt_name='Riverside'
-- i.e. where alt_name is a leading substring of name
IF (LEFT(pname,LENGTH(alt_name)) = alt_name) THEN
alt_name = null;
END IF;
-- If no alt_name then use ref as the alternative name
IF (alt_name is null) THEN
alt_name = reference;
END IF;
-- Odd case, but maybe we have an alt_name with no name
-- If we have both, then alt_name in parens after name
IF (pname is null) THEN
pname = alt_name;
ELSE
IF (alt_name IS NOT null) THEN
pname = pname || ' (' || alt_name || ')';
END IF;
END IF;
-- If no name:en but there is a alt_name:en then set name:en to
-- the alt_name:en value
IF (en_name is null) THEN
en_name = alt_en_name;
ELSE
IF (alt_en_name IS NOT null) THEN
en_name = en_name || ' (' || alt_en_name || ')';
END IF;
END IF;
-- Use elevation value for name or alt name as appropriate
IF (elevation::text ~ '^[0123456789\.]+$') THEN
IF (pname is null) THEN
IF (en_name is null) THEN
RETURN round(elevation::real * 3.28084)::text;
ELSE
RETURN en_name || chr(10) || round(elevation::real * 3.28084)::text;
END IF;
ELSE
IF (en_name is null) THEN
RETURN pname || chr(10) || round(elevation::real * 3.28084)::text;
ELSE
IF (pname <> en_name) THEN
RETURN pname || chr(10) || en_name || chr(10) || round(elevation::real * 3.28084)::text;
ELSE
RETURN pname || chr(10) || round(elevation::real * 3.28084)::text;
END IF;
END IF;
END IF;
ELSE
IF (pname is null) THEN
IF (en_name is null) THEN
RETURN null;
ELSE
RETURN en_name;
END IF;
ELSE
IF (en_name is null) THEN
RETURN pname;
ELSE
IF (pname <> en_name) THEN
RETURN pname || chr(10) || en_name;
ELSE
RETURN pname;
END IF;
END IF;
END IF;
END IF;
RETURN 'Huh?';
END; $$
LANGUAGE PLPGSQL;
And you change the SQL select to
SELECT way, "natural", "place", "tourism", "amenity",
display_name(planet_osm_point.name,
planet_osm_point.alt_name,
planet_osm_point."name:en",
planet_osm_point."alt_name:en",
planet_osm_point.ele,
planet_osm_point.ref) as name
FROM planet_osm_point WHERE ("natural" is not null) or ("place" is not null)) as foo
We haven’t even considered the possibility of the other name related tags in OSM:
- International name (int_name)
- Local name (loc_name)
- National name (nat_name)
- Official name (official_name)
- Old name (old_name)
- Regional name (reg_name)
- Short name (short_name)
- And all their international language equivalents.
Ugh.
There is a set of Postgresql extensions to help transliterate names. And maybe it is the way to go. But its focus is on creating a German (or possibly other language) transliteration from languages that use non-Latin scripts. I don’t think it covers all the possibilities of alt_name, loc_name, old_name, using the ref if there is no name, etc.
Edit
It is probably better to gather the various name bits needed from the tag field which is a Postgresql “hstore” (key->value hash) type. This allows the stored function to have fewer parameters:
CREATE OR REPLACE FUNCTION display_name(
pname text DEFAULT NULL::text, -- value of name field
elevation text DEFAULT NULL::text, -- value of ele field
reference text DEFAULT NULL::text, -- value of ref field
tags hstore DEFAULT NULL::hstore) -- all other tags
RETURNS text
And makes the call in each SQL select statement easier:
display_name(name, ele, ref, tags) as name
An advantage of this, in addition to being shorter to code, is that the logic can be more easily revised to use more fields (loc_name, official_name, etc.) without affecting the call parameters. And if automatic transliteration is attempted using something like the German library mentioned above, the tags field needed for that is already being passed.