Map

The Map field looks up its value in a map of key/value pairs, using the value of another field as the key. Maps are defined in the maps section of the schema and referenced by name in the Map field's from_map argument. They always refer to externally defined data.

Schema

categories:
  - name: POSTCODE
    file: "data/POSTCODE.csv"
maps:
  - name: POSTCODE_SUBURB_MAP
    file: "data/POSTCODE_SUBURB.csv"
fields:
  - name: Suburb
    type: Map
    args:
      key: Postcode
      from_map: POSTCODE_SUBURB_MAP
    null_probability: 0.5
  - name: Postcode
    type: WeightedCategory
    args:
      from_category: POSTCODE
  - name: Suburb2
    type: Map
    args:
      key: Postcode2
      from_map: POSTCODE_SUBURB_MAP
      default: "N/A"
    constraints:
      - type: IfNull
        name: Suburb
  - name: Postcode2
    type: Bothify
    args:
      format: "####"

# data/POSTCODE.csv
PCODE
6157
6000
6100
6101
6530

# data/POSTCODE_SUBURB.csv
PCODE,SUBURB
6157,Palmyra
6000,Perth
6100,Victoria Park
6101,East Victoria Park
6530,Geraldton

Output

| Suburb | Postcode | Suburb2 | Postcode2 |
| --- | --- | --- | --- |
| East Victoria Park | 6101 | | 4128 |
| Perth | 6000 | | 4642 |
| | 6000 | N/A | 4847 |
| | 6101 | N/A | 9223 |
| Victoria Park | 6100 | | 1988 |

Arguments

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| key | string | The name of the field to use as the key in the map. | |
| from_map | string | The name of the map to use. This is a reference to a name defined in the maps section. See Maps for more information. | |
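
The lookup described by key and from_map can be pictured with a minimal Python sketch. It is not the tool's implementation: the helper names load_map and map_lookup are hypothetical, the two-column (key, value) CSV layout is assumed from the example data, and the fallback behaviour of default is inferred from the example output, where random Postcode2 values that are not in the map yield "N/A".

```python
import csv
import io

# Inline copy of the example map data so the sketch runs without files.
POSTCODE_SUBURB_CSV = """PCODE,SUBURB
6157,Palmyra
6000,Perth
6100,Victoria Park
6101,East Victoria Park
6530,Geraldton
"""


def load_map(text):
    """Read a two-column CSV into a dict (assumed layout: key, value)."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return {row[0]: row[1] for row in reader}


def map_lookup(row, key, mapping, default=None):
    """Return the mapped value for row[key], or `default` if the key is absent."""
    return mapping.get(row[key], default)


suburbs = load_map(POSTCODE_SUBURB_CSV)

# A key that exists in the map resolves to its suburb.
print(map_lookup({"Postcode": "6101"}, "Postcode", suburbs))

# A key that is missing from the map falls back to the default.
print(map_lookup({"Postcode2": "4128"}, "Postcode2", suburbs, default="N/A"))
```

In this sketch the key argument names another field in the same row, which mirrors how Suburb reads Postcode and Suburb2 reads Postcode2 in the example schema.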

Field arguments

| Name | Type | Description | Default Value |
| --- | --- | --- | --- |
| null_probability | float | The probability that the field will be null. | 0.0 |
| constraints | list | A list of constraints to apply to the field. | [] |
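
As a rough illustration of null_probability, the sketch below nulls a value with the given probability. The function name apply_null_probability and the use of Python's random module are assumptions made for illustration; the generator's actual sampling method is not described here.

```python
import random


def apply_null_probability(value, null_probability=0.0, rng=random):
    """Return None with probability `null_probability`, else the value unchanged."""
    # rng.random() is in [0.0, 1.0), so 0.0 never nulls and 1.0 always nulls.
    if rng.random() < null_probability:
        return None
    return value


# With null_probability 0.5, roughly half of the generated values are null.
rng = random.Random(0)
values = [apply_null_probability("Perth", 0.5, rng) for _ in range(1000)]
null_share = values.count(None) / len(values)
print(null_share)
```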

Supported constraints

| Name | Description |
| --- | --- |
| IfNull | The value must only be non-null if another field is null. |
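
The effect of IfNull can be sketched as a small filter over a generated row. The helper apply_if_null is hypothetical and only mirrors the rule stated above; it matches the example output, where Suburb2 is populated only in rows whose Suburb is empty.

```python
def apply_if_null(value, row, other_field):
    """Keep `value` only when row[other_field] is null; otherwise return None."""
    return value if row.get(other_field) is None else None


# Suburb is null, so Suburb2 keeps its value.
print(apply_if_null("N/A", {"Suburb": None}, "Suburb"))

# Suburb is set, so Suburb2 is forced to null.
print(apply_if_null("N/A", {"Suburb": "Perth"}, "Suburb"))
```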