
Prediction

tselea edited this page May 23, 2019 · 1 revision

Usage

Each predict operation includes:

  • pre-processing - actions applied to the input data, which should be consistent with the pre-processing steps applied during training
  • model configuration - specifies the models to be loaded (their locations); any number of models can be combined to create an ensemble prediction

A prediction configuration YAML file is mandatory; it sets the parameters for the prediction process.

Starting the prediction step:

hugin predict --ensemble-config /path/to/predict_config.yaml --input-dir /path/to/input_dir/ --output-dir /path/to/output_dir/

Both --ensemble-config and --input-dir are mandatory. The --ensemble-config parameter must be followed by the path to the prediction configuration file.

An example can be found at hugin/etc/usecases/s2-forestry/predict_ftr_terrasigna.yaml.
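When predictions are launched from a script rather than typed by hand, the command shown above can be assembled programmatically. The following is a minimal sketch, assuming the hugin executable is available on the PATH; the helper names build_predict_command and run_prediction are hypothetical and not part of Hugin.

```python
import subprocess


def build_predict_command(config_path, input_dir, output_dir):
    """Hypothetical helper: build the argument list for `hugin predict`.

    Only the flags documented on this page are used.
    """
    return [
        "hugin", "predict",
        "--ensemble-config", str(config_path),
        "--input-dir", str(input_dir),
        "--output-dir", str(output_dir),
    ]


def run_prediction(config_path, input_dir, output_dir):
    """Run the prediction step; raises CalledProcessError on a non-zero exit.

    Assumption: the `hugin` executable is installed and on the PATH.
    """
    command = build_predict_command(config_path, input_dir, output_dir)
    return subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.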

Options

  • data_source - Mandatory parameter. This is a reference to the Hugin class that loads the input data set for the model. Hugin currently supports disk-based loading: files are read recursively from a given directory, provided their names match a given pattern.
    Value: !!python/object/apply:hugin.io.FileSystemLoader

    Additional options for the loader may be specified under the kwds parameter.
    • input_source - Mandatory parameter. Path to the input training directory from which sample images are extracted by matching the data_pattern regex. The loader recursively iterates through all subdirectories looking for matching file names.
      Example: /path/to/input_training/folder
    • data_pattern - Mandatory parameter. A regex pattern that must match all the possible image types and ground truth (GT) images that should be included in the sampling. The regex should include named groups (?P<group_name>regex), in order to make the image name pattern easier to identify.
      Example: '(?P<name>[0-9A-Za-z_]+)__(?P<idx>[A-Za-z0-9_.]+)_(?P<type>B.*).tif$'
      Explanation: A match for the pattern is: S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.05_B01_60m.tif
      name = S2A_OPER_MSI_L1C_TL_SGS
      idx = 20170519T130610_A009957_T34TFQ_N02.05
      type = B01_60m
    • id_format - Mandatory parameter. A pattern to uniquely identify each image in the dataset, regardless of image type. The loader groups the images based on the value of the id_format. It can be correlated with the named groups from data_pattern.
      Example: '{name}-{idx}'
    • type_format - Mandatory parameter. A pattern to uniquely identify each image type in the dataset, for each image id (id_format). It can be correlated with the named groups from data_pattern.
      Example: '{type}'
      Explanation: For the following image names:
      S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.05_B01_60m.tif
      S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.05_B02_10m.tif
      S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.05_B02_10m_GTI.tif
      S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.03_B01_60m.tif
      S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.03_B01_60m_GTI.tif
      the id_format and type_format examples produce these groups:

      - S2A_OPER_MSI_L1C_TL_SGS-20170519T130610_A009957_T34TFQ_N02.05:
        {
          B01_60m: S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.05_B01_60m.tif,
          B02_10m: S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.05_B02_10m.tif,
          B02_10m_GTI: S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.05_B02_10m_GTI.tif
        }
      - S2A_OPER_MSI_L1C_TL_SGS-20170519T130610_A009957_T34TFQ_N02.03:
        {
          B01_60m: S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.03_B01_60m.tif,
          B01_60m_GTI: S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_N02.03_B01_60m_GTI.tif
        }
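The interplay of data_pattern, id_format and type_format can be sketched in a few lines of Python. This is an illustration only, not Hugin's loader: group_files is a hypothetical helper, and the pattern used is an assumed form with the named groups name, idx and type, chosen so that it reproduces the example match described above.

```python
import re
from collections import defaultdict

# Assumed pattern with named groups `name`, `idx` and `type` (see lead-in).
DATA_PATTERN = r"(?P<name>[0-9A-Za-z_]+)__(?P<idx>[A-Za-z0-9_.]+)_(?P<type>B.*).tif$"
ID_FORMAT = "{name}-{idx}"
TYPE_FORMAT = "{type}"


def group_files(file_names, data_pattern=DATA_PATTERN,
                id_format=ID_FORMAT, type_format=TYPE_FORMAT):
    """Hypothetical helper: group file names by image id, then by image type."""
    pattern = re.compile(data_pattern)
    groups = defaultdict(dict)
    for file_name in file_names:
        match = pattern.search(file_name)
        if match is None:
            continue  # names that do not match the pattern are skipped
        fields = match.groupdict()
        image_id = id_format.format(**fields)      # one entry per image
        image_type = type_format.format(**fields)  # one entry per band/GT type
        groups[image_id][image_type] = file_name
    return dict(groups)


if __name__ == "__main__":
    prefix = "S2A_OPER_MSI_L1C_TL_SGS__20170519T130610_A009957_T34TFQ_"
    names = [
        prefix + "N02.05_B01_60m.tif",
        prefix + "N02.05_B02_10m_GTI.tif",
        prefix + "N02.03_B01_60m.tif",
    ]
    for image_id, types in group_files(names).items():
        print(image_id, sorted(types))
```

Because the format strings are filled from the named groups, changing id_format (for example to '{name}') changes how many files end up in the same group without touching the regex.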
  • ensemble - Mandatory parameter. Global configuration for the ensembling method. It is used to merge the results of multiple models and/or to merge the predictions that multiple strides produce for the same pixel. Currently, only average is supported.
    Example:
    ensemble:
      method: average
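To make the average method concrete, the sketch below averages several per-pixel prediction maps element by element. It is a simplified illustration in plain Python, not Hugin's implementation; average_predictions is a hypothetical name, and each prediction is assumed to be a 2D list of floats with the same shape.

```python
def average_predictions(predictions):
    """Average several per-pixel prediction maps element by element.

    `predictions` is a list of 2D lists (rows of floats) with identical
    shapes, for example one map per model.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    rows, cols = len(predictions[0]), len(predictions[0][0])
    for pred in predictions:
        if len(pred) != rows or any(len(row) != cols for row in pred):
            raise ValueError("all predictions must have the same shape")
    count = len(predictions)
    return [
        [sum(pred[r][c] for pred in predictions) / count for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    model_a = [[0.0, 1.0], [0.5, 0.5]]
    model_b = [[1.0, 1.0], [0.5, 0.0]]
    print(average_predictions([model_a, model_b]))
```

The same element-wise mean applies whether the maps come from different models or from overlapping windows of one model.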
  • model - Mandatory parameter. Includes all the models to be used in prediction. At least one model should be mentioned.
    • model_name - Mandatory parameter. Replace model_name with the actual model name, which must be unique when multiple models are present.
      • path - Mandatory parameter. Full path to the stored pre-trained model weights.
      • builder - Mandatory parameter. Specify the model builder function, in the format module:function_name.
      • type - Mandatory parameter. Model type. Currently Hugin supports two possible options: keras or sklearn.
      • window_size - Mandatory parameter. List with two elements in the format [W, H], specifying the size, width (W) x height (H), of the slice window to be cropped from the original image. Useful if the model is built for a specific input image size.
      • stride_size - Mandatory parameter. An integer specifying the stride to move the slicing window across the input image.
      • match_i - Mandatory parameter. Regular expression that restricts prediction with the current model to datasets whose id matches it. Can be used to apply models conditionally.
      • batch_size - Mandatory parameter. Sets the batch size.
      • swap_axes - Mandatory parameter. Used to swap to the (nchannel, dim1, dim2, ...) convention and vice versa.
      • mapping - Mandatory parameter. Includes a list of selected image types and channels (indexed from 1) for prediction. No direct value.
        • inputs - Mandatory parameter. List of lists in the format [image-type,channel-number], for the input source. The image-type must correspond with one of the identified type_format matching patterns.
          Example:
          - [B02_10m, 1]
          - [B02_10m, 2]
          - [B02_10m, 3]
        • target - Optional parameter. List of lists in the format [image-type,channel-number], for the GT. The image-type must correspond with one of the identified type_format matching patterns. The target should be a class vector image, which is later transformed into a one-hot encoded (categorical) vector. Useful for computing specific metrics.
          Example:
          - [B02_10m_GTI]
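The effect of window_size and stride_size can be illustrated by enumerating the window positions on an image and counting how many windows cover each pixel; pixels covered more than once are the ones whose predictions the ensemble method has to merge. This is a sketch under stated assumptions, not Hugin's tiling code: window_origins and coverage are hypothetical helpers, and the border handling (shifting the last window back inside the image) is an assumption.

```python
def window_origins(image_size, window_size, stride_size):
    """Yield (x, y) top-left corners of the windows covering an image.

    Assumption: a window that would cross the right or bottom border is
    shifted back inside the image; Hugin's border handling may differ.
    """
    width, height = image_size
    win_w, win_h = window_size
    if win_w > width or win_h > height:
        raise ValueError("window is larger than the image")

    def positions(length, window):
        last = length - window
        steps = list(range(0, last + 1, stride_size))
        if steps[-1] != last:
            steps.append(last)  # extra window so the border is covered
        return steps

    for y in positions(height, win_h):
        for x in positions(width, win_w):
            yield (x, y)


def coverage(image_size, window_size, stride_size):
    """Count how many windows cover each pixel (rows are indexed by y)."""
    width, height = image_size
    win_w, win_h = window_size
    counts = [[0] * width for _ in range(height)]
    for x, y in window_origins(image_size, window_size, stride_size):
        for row in range(y, y + win_h):
            for col in range(x, x + win_w):
                counts[row][col] += 1
    return counts


if __name__ == "__main__":
    # A 6x4 image, a 4x4 window and a stride of 2 give two overlapping windows.
    print(list(window_origins((6, 4), (4, 4), 2)))
    print(coverage((6, 4), (4, 4), 2)[0])
```

A stride smaller than the window size produces overlapping windows, which increases the number of predictions per pixel at the cost of more model invocations.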
