Music Metadata

Overview

Commonly supported music metadata tags were generally designed for popular music. They are not a good fit for genres that have more than one track (“song”) per performance (symphony, Broadway musical, etc.), more than one credited composer, or more than one credited performer.

Working within the constraints of a mixed Apple MacOS iTunes and Android environment, I have some thoughts on how the available metadata tags can be used.

Genre

While there appear to be conventions, there does not appear to be any official standard for genre naming. The wide variety of music found under fairly common genre names such as “Country & Folk”, “Jazz” or even “Classical” really needs to be divided further. I have decided to make genre names generally hierarchical, with a spaced dash separating the levels (e.g. “Country & Folk – Bluegrass”). I am not entirely consistent, though: I refuse to put a Broadway musical into the more common “soundtrack” category.
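
A hierarchical genre name is easy to take apart again in a script. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the levels are separated by a hyphen or en dash with a space on each side.

```python
import re

# Assumption: levels are separated by a hyphen or en dash with spaces on both
# sides, so names such as "Hip-Hop" are left alone.
_LEVEL_SEPARATOR = re.compile(r"\s+[-\u2013]\s+")


def split_genre(genre):
    """Return the hierarchy levels of a genre name, broadest first."""
    return [level.strip() for level in _LEVEL_SEPARATOR.split(genre.strip())]


def is_within(genre, parent):
    """True if `genre` equals `parent` or sits somewhere below it."""
    parent_levels = split_genre(parent)
    return split_genre(genre)[: len(parent_levels)] == parent_levels
```

With this, is_within("Country & Folk – Bluegrass", "Country & Folk") is true, so a script could gather every sub-genre under one top-level name.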

Album

There are two schools of thought I’ve seen searching the Internet on how to deal with albums:

  • Keep the original LP or CD album name and structure intact. One advantage of this is that metadata for the album can be found on the Internet. For albums that have more than one performance (e.g. a CD with trumpet concertos by both Hummel and Haydn) use a “grouping” tag to mark the separate performances. Unfortunately, Android’s Media Service does not currently support a grouping tag or a suitable substitute for it.
  • Break the CD into “synthetic” albums, each performance becoming a separate album. This is supported by most platforms, specifically Android, and is what I decided to do. For classical music I have decided to name these performance-related albums using the composer, a colon and then the performance name (e.g. “Beethoven: Piano Concerto #3 In C Minor, Op. 37”). This aids in sorting and searching, as composer is not a field one can easily search or sort on in iTunes.
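
The “composer, colon, performance” naming of synthetic albums can be sketched in a few lines. This is an illustration only: the function names and the optional qualifier argument (used for initials such as “CPE” when several composers share a surname) are assumptions, not part of any tagging tool.

```python
def synthetic_album(composer_surname, performance, qualifier=None):
    """Build an album name of the form 'Composer: Performance'."""
    label = composer_surname
    if qualifier:
        # e.g. "Bach (CPE)" to tell apart composers with the same surname
        label = f"{composer_surname} ({qualifier})"
    return f"{label}: {performance}"


def split_album(album):
    """Split 'Composer: Performance' back into (composer label, performance).

    Returns (None, album) when the name contains no ': ' at all.
    """
    label, separator, performance = album.partition(": ")
    if not separator:
        return None, album
    return label, performance
```

Note that split_album only looks for the first colon, so an ordinary album title that happens to contain one would be split as well.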

Song Title

Track titles from the original CDs vary in format quite a bit. Some include the full performance name in addition to the movement or song (e.g. “Old Friends (from Merrily We Roll Along)” or “Bach (CPE): Sinfonia #1 In G, H 657 1. Allegro Di Molto”). I have decided to remove album/performance names from the individual tracks (e.g. “Old Friends (from Merrily We Roll Along)” becomes “Old Friends” and “Bach (CPE): Sinfonia #1 In G, H 657 1. Allegro Di Molto” becomes “1. Allegro Di Molto”).

Classical music often has a movement number preceding the name. I have decided to retain that for display purposes even though the information is redundant with the information contained in the “track” tag.
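
Removing the performance name from a track title could be automated roughly as follows. This is a sketch under two assumptions: a show name appears as a trailing “(from ...)” note, and a classical title starts with the exact album name. The function name is hypothetical, and real-world titles will need more patterns than these two.

```python
import re

# Matches a trailing "(from <show name>)" note at the end of a title.
_FROM_NOTE = re.compile(r"\s*\(from [^)]*\)\s*$")


def strip_performance(title, album=""):
    """Return the track title without the album/performance name."""
    cleaned = _FROM_NOTE.sub("", title)
    # Drop the album name when the title starts with it.
    if album and cleaned.startswith(album):
        cleaned = cleaned[len(album):].lstrip(" :-")
    # Never return an empty title; fall back to the original instead.
    return cleaned or title
```

The movement number survives because only the leading album name and any separator characters after it are removed.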

Composer

As downloaded from the Internet or found embedded on CDs released after the mid-1990s, composer information follows a wide variety of conventions, especially when more than one individual is credited.

For classical music, I have decided to separate individuals with a semicolon and to write each name in the form “surname, name”. In a sorted list I would rather look under “B” for Bach than under “C” (Carl Philipp Emanuel Bach) or “J” (Johann Sebastian Bach).
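
Converting downloaded composer names to this form might look like the sketch below. It assumes the surname is simply the last word of the name, which is wrong for compound surnames such as “Vaughan Williams”, so the result still needs a manual check. The function names are made up for illustration.

```python
def surname_first(name):
    """Turn 'Johann Sebastian Bach' into 'Bach, Johann Sebastian'.

    Assumption: the surname is the last word. Names that already contain a
    comma, or that consist of a single word, are returned unchanged apart
    from whitespace cleanup.
    """
    parts = name.split()
    if "," in name or len(parts) < 2:
        return " ".join(parts)
    given_names = " ".join(parts[:-1])
    return f"{parts[-1]}, {given_names}"


def composer_tag(names):
    """Join several composers with a semicolon, each surname first."""
    return "; ".join(surname_first(name) for name in names)
```

For example, composer_tag(["Johann Sebastian Bach", "Jerome David Kern"]) gives “Bach, Johann Sebastian; Kern, Jerome David”.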

For other genres, I generally follow the same convention as for classical music but am not quite as consistent. Still, you are more likely to see “Kern, Jerome David” than “Jerome Kern” in my library. Part of the inconsistency is that iTunes does not have a quick way to view or sort music by composer, so I am less likely to be thorough. (Edit: I found that you can add a composer column to the iTunes list display and then sort by that column to find all the composers.)

With respect to album naming, if there are multiple composers in my collection with the same surname, I try to distinguish them in the album name (e.g. “Bach (CPE): Sinfonia #5 In B Minor, H 661”).

Artist

On the Internet, metadata for artists seems to be exclusively of the form “name surname”. I’ve decided to follow that convention even though it is at odds with my choice for the composer field. I still use a semicolon to separate multiple individuals or named groups, and I generally order the artists as soloist, conductor, group (e.g. “Radu Lupu; Zubin Mehta; Israel Philharmonic Orchestra”).
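
The soloist, conductor, group ordering can be captured in a small helper. The sketch below is an assumption about how one might script it; the function names and arguments are hypothetical.

```python
def join_people(names):
    """Join names with '; ', skipping empty entries."""
    return "; ".join(name.strip() for name in names if name and name.strip())


def split_people(tag):
    """Split a semicolon-separated tag back into individual names."""
    return [name.strip() for name in tag.split(";") if name.strip()]


def artist_tag(soloists=(), conductor=None, groups=()):
    """Build an artist tag ordered as soloist(s), conductor, group(s)."""
    ordered = list(soloists)
    if conductor:
        ordered.append(conductor)
    ordered.extend(groups)
    return join_people(ordered)
```

Calling artist_tag(["Radu Lupu"], "Zubin Mehta", ["Israel Philharmonic Orchestra"]) reproduces the example above.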

Album Artist

I don’t find much use for this tag, especially as most of my collection has been broken into a single composition/performance per album. This means that the track/song artist is nearly always identical to the album artist.

Android Media Service

Here is a summary, taken from current source code, of the information that Android keeps about each music file:

Field/Tag       | Type              | Interface    | Description
----------------+-------------------+--------------+------------
_count          | INTEGER           | BaseColumns  | The count of rows in a directory.
_data           | TEXT              | MediaColumns | Path to the file on disk
_display_name   | TEXT              | MediaColumns | The display name of the file
_id             | INTEGER           | BaseColumns  | The unique ID for a row.
_size           | INTEGER           | MediaColumns | The size of the file in bytes
album_artist    | TEXT              | AudioColumns | The artist credited for the album that contains the audio file
album_id        | INTEGER (long)    | AudioColumns | The id of the album the audio file is from, if any
album_key       | TEXT              | AudioColumns | A non human readable key calculated from the ALBUM, used for searching, sorting and grouping
album           | TEXT              | AudioColumns | The album the audio file is from, if any
artist_id       | INTEGER (long)    | AudioColumns | The id of the artist who created the audio file, if any
artist_key      | TEXT              | AudioColumns | A non human readable key calculated from the ARTIST, used for searching, sorting and grouping
artist          | TEXT              | AudioColumns | The artist who created the audio file, if any
bookmark        | INTEGER (long)    | AudioColumns | The position, in ms, playback was at when playback for this file was last stopped.
compilation     | TEXT              | AudioColumns | Whether the song is part of a compilation
composer        | TEXT              | AudioColumns | The composer of the audio file, if any
date_added      | INTEGER           | MediaColumns | The time the file was added to the media provider. Units are seconds since 1970.
date_modified   | INTEGER           | MediaColumns | The time the file was last modified. Units are seconds since 1970.
duration        | INTEGER (long)    | AudioColumns | The duration of the audio file, in ms
genre           | TEXT              | AudioColumns | The genre of the audio file, if any. Does not exist in the database; only used by the media scanner for inserts.
height          |                   | MediaColumns | The height of the image/video in pixels.
is_alarm        | INTEGER (boolean) | AudioColumns | Non-zero if the audio file may be an alarm
is_drm          | INTEGER (boolean) | MediaColumns | Non-zero if the media file is drm-protected
is_music        | INTEGER (boolean) | AudioColumns | Non-zero if the audio file is music
is_notification | INTEGER (boolean) | AudioColumns | Non-zero if the audio file may be a notification sound
is_podcast      | INTEGER (boolean) | AudioColumns | Non-zero if the audio file is a podcast
is_ringtone     | INTEGER (boolean) | AudioColumns | Non-zero if the audio file may be a ringtone
mime_type       | TEXT              | MediaColumns | The MIME type of the file
title_key       | TEXT              | AudioColumns | A non human readable key calculated from the TITLE, used for searching, sorting and grouping
title           | TEXT              | MediaColumns | The title of the content
track           | INTEGER           | AudioColumns | The track number of this song on the album, if any. This number encodes both the track number and the disc number. For multi-disc sets, this number will be 1xxx for tracks on the first disc, 2xxx for tracks on the second disc, etc.
width           |                   | MediaColumns | The width of the image/video in pixels.
year            | INTEGER           | AudioColumns | The year the audio file was recorded, if any
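
The “track” field packs the disc number and the track number into a single integer. Going by the description above (1xxx for the first disc, 2xxx for the second), the two parts can be separated with a division by 1000. The sketch below uses made-up function names and treats a value under 1000 as “no disc number recorded”, which is an assumption.

```python
TRACKS_PER_DISC = 1000  # from the 1xxx / 2xxx scheme in the field description


def decode_track(encoded):
    """Split an encoded track value into (disc, track).

    Assumption: a value below 1000 has no disc number, reported as disc 0.
    """
    return divmod(encoded, TRACKS_PER_DISC)


def encode_track(disc, track):
    """Combine a disc number and a track number into one value."""
    if not 0 <= track < TRACKS_PER_DISC:
        raise ValueError("track number must be between 0 and 999")
    return disc * TRACKS_PER_DISC + track
```

So a stored value of 2005 means track 5 on the second disc, and encode_track(1, 12) gives 1012.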