OpenStreetMap - Tag statistics (Linux, OS X, Windows)

The extensive data of the OpenStreetMap (OSM) project consists of:
- the recorded GPS data objects (node, way, closedway, relation)
- the description of these data objects (tags in the form key=value)
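
The data model above can be illustrated with a short Python sketch. It is not taken from list_osm_tags.pl. The sample data is invented, and the rule that a way counts as a "closedway" when its last node reference equals its first is an assumption about how that element type is defined.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Invented miniature OSM XML extract, for illustration only.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<osm version="0.6">
  <node id="1" lat="51.96" lon="7.62"><tag k="amenity" v="cafe"/></node>
  <node id="2" lat="51.97" lon="7.62"/>
  <node id="3" lat="51.97" lon="7.63"/>
  <way id="10"><nd ref="1"/><nd ref="2"/><nd ref="3"/><nd ref="1"/>
    <tag k="building" v="yes"/></way>
  <way id="11"><nd ref="2"/><nd ref="3"/><tag k="highway" v="residential"/></way>
  <relation id="20"><member type="way" ref="11" role=""/>
    <tag k="type" v="route"/></relation>
</osm>
"""

def element_type(elem):
    """Return 'node', 'way', 'closedway' or 'relation' for an OSM element."""
    if elem.tag != "way":
        return elem.tag
    refs = [nd.get("ref") for nd in elem.findall("nd")]
    # Assumption: a way is closed when it ends on the node it started from.
    closed = len(refs) > 1 and refs[0] == refs[-1]
    return "closedway" if closed else "way"

def count_keys(source):
    """Count tag keys per element type while streaming through the XML."""
    counts = {t: Counter() for t in ("node", "way", "closedway", "relation")}
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag in ("node", "way", "relation"):
            kind = element_type(elem)
            for tag in elem.findall("tag"):
                counts[kind][tag.get("k")] += 1
            elem.clear()  # drop children of elements that are already counted
    return counts

counts = count_keys(io.BytesIO(SAMPLE.encode("utf-8")))
for kind, keys in counts.items():
    print(kind, dict(keys))
```

Streaming with `iterparse` rather than loading the whole file keeps the approach usable for larger extracts, although very large files would need additional care with memory.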

To use OSM data in your own projects (e.g. thematic maps), you first need an overview of the data, both its quantity and its quality. The utility list_osm_tags.pl helps with this: it lists all (important) tags in clearly arranged tables. Output example (schematic):

...

********************************************************************************************************************
****************************************  CLOSEDWAY Keys - Summery Section  ****************************************
********************************************************************************************************************

    Uses Primary Keys                    Description
-------- ------------------------------  ------------------------------------------------------------
     ... ...                             ...
     50x aeroway                         Summery
  64912x building                        Summery
   1413x highway                         Summery
  37541x landuse                         Summery
     22x military                        Summery
   4012x natural                         Summery
    244x waterway                        Summery
     ... ...                             ...
-------- ------------------------------  ------------------------------------------------------------
 120178x Primary Keys

********************************************************************************************************************
****************************************  CLOSEDWAY Keys - Details Section  ****************************************
********************************************************************************************************************

...

    Uses AEROWAY                         Other Keys used Together with AEROWAY
-------- ------------------------------  ------------------------------------------------------------
     12x aerodrome ....................  1x aerodrome, 2x alt_name, 2x area, 1x closest_town, 2x ele, 1x fenced,
                                         1x iata, 4x icao, 1x is_in, 1x landuse, 8x name, 1x name:de, 1x name:en,
     15x apron ........................  1x created_by, 1x surface
      5x hangar .......................  5x building
      7x helipad ......................  1x area, 1x description, 1x emergency, 1x highway, 1x surface
      1x model ........................  
      5x runway .......................  1x area, 1x description, 1x name, 1x source, 3x surface, 1x website
      1x taxiway ......................  
      3x terminal .....................  3x building, 3x name
      1x tower ........................  1x building
-------- ------------------------------  ------------------------------------------------------------
     50x aeroway

...
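
A details table of this kind can be thought of as a co-occurrence count. For each value of a primary key, it counts how often the value occurs and which other keys appear on the same element. The following Python sketch shows the idea on invented data. The function name `details` and the sample elements are hypothetical, and the actual Perl program may compute its tables differently.

```python
from collections import Counter, defaultdict

# Hypothetical closed ways, each reduced to its tag dictionary.
elements = [
    {"aeroway": "hangar", "building": "yes"},
    {"aeroway": "hangar", "building": "yes"},
    {"aeroway": "runway", "surface": "asphalt", "name": "09/27"},
    {"aeroway": "runway", "surface": "grass"},
    {"building": "yes"},
]

def details(elements, primary_key):
    """Return (uses per value, other keys seen per value) for one primary key."""
    uses = Counter()
    together = defaultdict(Counter)
    for tags in elements:
        if primary_key not in tags:
            continue  # element does not carry the primary key at all
        value = tags[primary_key]
        uses[value] += 1
        together[value].update(k for k in tags if k != primary_key)
    return uses, together

uses, together = details(elements, "aeroway")
for value in sorted(uses):
    others = ", ".join(f"{n}x {k}" for k, n in sorted(together[value].items()))
    print(f"{uses[value]:>7}x {value + ' ':.<30}  {others}")
print(f"{sum(uses.values()):>7}x aeroway")
```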

For each data element type, an output file is created with a corresponding file extension:
- *.closedway
- *.node
- *.relation
- *.way

The user determines which keys are included in the output when calling the program:
- list of primary keys to include
- list of primary keys to exclude
- list of secondary keys to include
- list of secondary keys to exclude
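
How an include list and an exclude list might interact can be sketched in Python as follows. This only illustrates the filtering idea. The helper `make_key_filter` is hypothetical, and two of its behaviours are assumptions rather than documented behaviour of list_osm_tags.pl: an empty include list means "all keys", and the exclude list wins when a key is on both lists.

```python
def make_key_filter(include=None, exclude=None):
    """Build a predicate that decides whether a key belongs in the results."""
    include = set(include or [])
    exclude = set(exclude or [])

    def keep(key):
        if key in exclude:
            return False  # assumption: exclusion takes precedence
        # Assumption: with no include list, every remaining key is kept.
        return not include or key in include

    return keep

keys = ["aeroway", "building", "highway"]

keep = make_key_filter(include=["aeroway", "building"], exclude=["building"])
kept = [k for k in keys if keep(k)]
print(kept)

keep_all = make_key_filter()
print([k for k in keys if keep_all(k)])
```

The same predicate could be applied once to primary keys and once to secondary keys, each with its own pair of lists.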

To begin with, it is recommended to specify a list of primary keys to include. All of the above lists are optional, however. If they are omitted, a very large result set is generated, which can be hard to follow.

The program is written in Perl (open source) and is therefore operating system independent. It is compatible with:
- Windows
- Mac OS X
- Linux
On Windows, you may first have to install a Perl interpreter (e.g. ActivePerl).

The utility has, in principle, no restrictions on the amount of data it can process. It is nevertheless recommended to start with a small amount of data. Example processing times on an iMac with a 3 GHz Core 2 Duo processor:
- Muenster (administrative district), 2 million elements, <1 minute
- North Rhine-Westphalia (federal state), 16 million elements, 5 minutes
- Germany (country), 84 million elements, 26 minutes
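
The two larger measurements imply a roughly constant throughput, which gives a rough way to estimate run times for other extracts. The Python sketch below does this arithmetic. It assumes that processing time scales linearly with the number of elements, which the figures suggest but do not guarantee, and the 200 million element extract is a made-up example.

```python
# (elements, seconds) taken from the processing time examples above.
timings = {
    "North Rhine-Westphalia": (16_000_000, 5 * 60),
    "Germany": (84_000_000, 26 * 60),
}

rates = {name: elements / seconds for name, (elements, seconds) in timings.items()}
for name, rate in rates.items():
    print(f"{name}: about {rate:,.0f} elements per second")

def estimate_minutes(elements, rate):
    """Estimate run time in minutes, assuming linear scaling."""
    return elements / rate / 60

# Hypothetical extract with 200 million elements on the same hardware.
print(f"{estimate_minutes(200_000_000, rates['Germany']):.0f} minutes")
```

Both samples work out to a little over 53,000 elements per second on the hardware named above.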

If you start the program without parameters, the following usage information is displayed:

list_osm_tags.pl - List Tag Data / Statistics for OSM-Data-File, Release 1.0.0 (2011/06/09)

Usage:
perl list_osm_tags.pl [Options] -osmdata="file"

Example:
perl list_osm_tags.pl -processPrimaryKeys="M:IncludePrimaryKeys" -osmdata="M:germany.osm"

Argument:
-osmdata              = file containig the osm xml data

Options:
-h | -?               = show help (this)
-lineprefix           = prefix for each line (e.g. "# " for mkgmap style files)
-processPrimaryKeys   = file containing primary keys to include into the results
-ignorePrimaryKeys    = file containing primary keys to exclude from the results
-processSecondaryKeys = file containing secondary keys to include into the results
-ignoreSecondaryKeys  = file containing seccondary keys to exclude from the results

Remark:
- The data representation in all files is UTF-8.
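
When the utility is called from a script, the command line can be assembled from the options listed above. The Python sketch below only builds the argument list and does not run anything. The helper `build_command` and the file names are hypothetical, and it assumes that `perl` and list_osm_tags.pl can be found from the current directory.

```python
import shlex

def build_command(osmdata, process_primary=None, ignore_primary=None,
                  process_secondary=None, ignore_secondary=None, lineprefix=None):
    """Assemble an argument list using only the options shown in the usage text."""
    options = {
        "-lineprefix": lineprefix,
        "-processPrimaryKeys": process_primary,
        "-ignorePrimaryKeys": ignore_primary,
        "-processSecondaryKeys": process_secondary,
        "-ignoreSecondaryKeys": ignore_secondary,
    }
    cmd = ["perl", "list_osm_tags.pl"]
    # Only pass options that were actually set; all of them are optional.
    cmd += [f"{name}={value}" for name, value in options.items() if value is not None]
    cmd.append(f"-osmdata={osmdata}")
    return cmd

cmd = build_command("germany.osm", process_primary="IncludePrimaryKeys")
print(shlex.join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single string avoids shell quoting issues with file names that contain spaces.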

Download: list_osm_tags-100.zip