Maps

Maps define key-to-value lookups that fields can reference by name. They are declared in the maps section of the schema, and each map is backed by a file.

The file backing a map is expected to be in CSV format. By default, the first row is treated as a header row. The first column holds the key, and the second column holds the value that the key maps to.
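To make the file layout concrete, here is a minimal Python sketch of how a two-column CSV could be read into a key-to-value lookup. It is not the tool's own implementation; the function name load_map and its header parameter are hypothetical and only mirror the header argument described in this section.

```python
import csv
import io


def load_map(csv_text, header=True):
    """Read two-column CSV text into a dict: first column -> second column.

    Hypothetical helper for illustration only; it mirrors the documented
    layout (optional header row, key in column one, value in column two).
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if header and rows:
        rows = rows[1:]  # skip the header row
    # Blank lines and rows with fewer than two columns are ignored.
    return {row[0]: row[1] for row in rows if len(row) >= 2}


sample = "POSTCODE,SUBURB\n7000,Hobart\n6000,Perth\n"
postcode_to_suburb = load_map(sample)
print(postcode_to_suburb["7000"])
```

Keys and values are kept as strings here, so the postcode 7000 is looked up as "7000".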

This makes it possible to map a randomly selected category consistently to its corresponding attribute, for example a postcode to its suburb.
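The idea of a consistent mapping can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how the tool generates data: the dictionary contents, the function generate_rows, and the uniform random choice are all hypothetical stand-ins for a category selection followed by a map lookup.

```python
import random

# Hypothetical in-memory map: postcode (key) -> suburb (value).
postcode_to_suburb = {"7000": "Hobart", "6000": "Perth", "5000": "Adelaide"}


def generate_rows(mapping, count, seed=None):
    """Pick a random key per row, then look up its value in the mapping."""
    rng = random.Random(seed)
    keys = sorted(mapping)
    rows = []
    for _ in range(count):
        postcode = rng.choice(keys)  # the randomly selected category
        # The lookup guarantees the same postcode always gets the same suburb.
        rows.append({"Postcode": postcode, "Suburb": mapping[postcode]})
    return rows


rows = generate_rows(postcode_to_suburb, 5, seed=1)
for row in rows:
    print(row["Postcode"], row["Suburb"])
```

However the postcode is chosen, the suburb is derived from it rather than chosen independently, so the two columns never disagree.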

Note: There is a small amount of awkwardness here. You will probably want to select a category from the first column of the map and then use the map to look up the value for that category in the second column.

The current limitation is that, to do this, you need to copy the first column from the map file into a new category file with the same contents.
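Copying the first column by hand is easy to get wrong for larger files, so a small script can help. The Python sketch below is one possible approach, assuming the category file is a single-column CSV with a header row; that layout is an assumption, so check the category file format your schema expects. The function names first_column and write_category_csv are hypothetical.

```python
import csv
import io


def first_column(map_csv_text, header=True):
    """Return the unique keys from the first column, in file order."""
    rows = [row for row in csv.reader(io.StringIO(map_csv_text)) if row]
    if header and rows:
        rows = rows[1:]  # skip the header row
    # dict.fromkeys removes duplicate keys while preserving order.
    return list(dict.fromkeys(row[0] for row in rows))


def write_category_csv(keys, column_name="POSTCODE"):
    """Build single-column CSV text: a header row, then one key per row.

    The single-column layout is an assumption about the category file.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column_name])
    for key in keys:
        writer.writerow([key])
    return out.getvalue()


map_text = "POSTCODE,SUBURB\n7000,Hobart\n6000,Perth\n7000,Hobart\n"
category_text = write_category_csv(first_column(map_text))
print(category_text)
```

In practice you would read the map text from the map file (for example data/POSTCODE_SUBURB.csv) and write the result to the category file (for example data/POSTCODE.csv), rerunning the script whenever the map file changes.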

Schema

The schema below makes the contents of data/POSTCODE_SUBURB.csv available as a map called POSTCODE_SUBURB_MAP.

maps:
  - name: POSTCODE_SUBURB_MAP
    file: "data/POSTCODE_SUBURB.csv"
    header: true

Arguments

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| name | string | The name of the map. This is the name by which it can be referenced in other parts of the schema. | |
| file | string | The path to the file containing the map data. | |
| header | bool | Whether the first row of the file is a header row. | true |

Example field using a map

categories:
  - name: POSTCODE
    file: "data/POSTCODE.csv"
maps:
  - name: POSTCODE_SUBURB_MAP
    file: "data/POSTCODE_SUBURB.csv"
fields:
  - name: Suburb
    type: Map
    args:
      key: Postcode
      from_map: POSTCODE_SUBURB_MAP
  - name: Postcode
    type: WeightedCategory
    args:
      from_category: POSTCODE

Example map file

# data/POSTCODE_SUBURB.csv
POSTCODE,SUBURB
7000,Hobart
6000,Perth
5000,Adelaide
4000,Brisbane
3000,Melbourne
2000,Sydney
1000,Canberra