Triggers
Scale uses triggers for automatically generating jobs and recipes to execute as new source data enters the system. Rules are configured, and when a trigger event occurs in Scale that matches an existing trigger rule, the job(s) and/or recipe(s) for the rule are created and placed on the queue. A given trigger event can trigger multiple rules. There are three different types of Scale triggers: ingest triggers, parse triggers, and clock triggers.
Ingest triggers are triggers that can occur when a source file is ingested into Scale. A trigger event is generated for every file ingest and checked against all ingest trigger rules.
Example ingest trigger configuration:
{
"version": "1.0",
"condition": {
"media_type": "text/plain",
"data_types": [
"foo",
"bar"
]
},
"data": {
"input_data_name": "my_file",
"workspace_name": "my_workspace"
}
}
The condition field is used to define the conditions for when the ingest rule is triggered. The media_type field says that an ingested file must have a media type of text/plain (a plain text file) in order to trigger this rule. The data_types field specifies that the ingested file must also have the data types “foo” and “bar” tagged on it in order to trigger the rule. The data field specifies the information needed to create the applicable job/recipe (whatever the trigger rule is linked to) when the rule is triggered. The input_data_name field defines the input parameter name of the job/recipe that the ingested file should be passed to, and the workspace_name field gives the unique system name of the workspace for storing all of the products generated by the created job/recipe. To see all of the options for an ingest trigger rule’s configuration, please refer to the Ingest Trigger Configuration Specification below.
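The condition-matching behavior described above can be sketched as follows. This is a hypothetical illustration of the documented rules (empty media type accepts everything, all listed data types must be tagged on the file), not Scale's actual implementation; the function name is invented for this example.

```python
def matches_condition(condition, media_type, data_types):
    """Check whether an ingested file satisfies a trigger rule's condition.

    condition: the "condition" object from a trigger rule configuration
    media_type: the ingested file's media type string
    data_types: the set of data type tags on the ingested file
    """
    wanted_media = condition.get("media_type", "")
    # An empty media_type in the rule accepts all file media types.
    if wanted_media and wanted_media != media_type:
        return False
    # Every data type listed in the rule must be tagged on the file.
    wanted_types = set(condition.get("data_types", []))
    return wanted_types.issubset(data_types)

rule = {"media_type": "text/plain", "data_types": ["foo", "bar"]}
print(matches_condition(rule, "text/plain", {"foo", "bar", "baz"}))  # True
print(matches_condition(rule, "text/plain", {"foo"}))                # False
```

Note that an empty condition object matches every ingest, which mirrors the behavior of a rule with no condition field at all.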
A valid ingest trigger rule configuration is a JSON document with the following structure:
{
"version": "1.0",
"condition": {
"media_type": STRING,
"data_types": [
STRING,
STRING
]
},
"data": {
"input_data_name": STRING,
"workspace_name": STRING
}
}
version
Type: String
Required: No
Defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value, if not included, is the latest version (currently 1.0). It is recommended, though not required, that you include the version so that future changes to the specification will still accept your ingest trigger rule configuration.
condition
Type: JSON object
Required: No
Contains other fields that specify the conditions under which this ingest rule is triggered. If not provided, the rule is triggered by EVERY source file ingest.
media_type
Type: String
Required: No
Defines a media type. An ingested file must have the identical media type defined here in order to trigger this rule. If not provided, the field defaults to "" and all file media types are accepted by the rule.
data_types
Type: Array
Required: No
A list of data type strings. An ingested file must have all of the data types that are listed here tagged to the file in order to trigger this rule. If not provided, the field defaults to [] and no data types are required.
data
Type: JSON object
Required: Yes
Contains other fields that specify the details for creating the job/recipe linked to this trigger rule.
input_data_name
Type: String
Required: Yes
Specifies the input parameter name of the triggered job/recipe that the ingested file should be passed to when the job/recipe is created and placed on the queue.
workspace_name
Type: String
Required: Yes
Contains the unique system name of the workspace that should store the products created by the triggered job/recipe.
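A minimal structural check of an ingest trigger rule configuration against the specification above might look like the following sketch. The helper name is hypothetical and this is illustration only; Scale performs its own, more thorough validation.

```python
def validate_ingest_rule(config):
    """Raise ValueError if the configuration violates the spec above.

    data.input_data_name and data.workspace_name are required strings;
    condition and its sub-fields are optional.
    """
    data = config.get("data")
    if not isinstance(data, dict):
        raise ValueError("'data' is required and must be a JSON object")
    for field in ("input_data_name", "workspace_name"):
        if not isinstance(data.get(field), str):
            raise ValueError("'data.%s' is required and must be a string" % field)
    condition = config.get("condition", {})
    if not isinstance(condition.get("media_type", ""), str):
        raise ValueError("'condition.media_type' must be a string")
    if not isinstance(condition.get("data_types", []), list):
        raise ValueError("'condition.data_types' must be an array")

# A configuration matching the example above passes silently.
validate_ingest_rule({
    "version": "1.0",
    "data": {"input_data_name": "my_file", "workspace_name": "my_workspace"},
})
```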
Parse triggers are triggers that can occur when a source file is parsed. This happens when a job completes with a parse_results section in its generated results manifest file; see Results Manifest. A trigger event is generated for every source file parse and checked against all parse trigger rules.
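Conceptually, each entry in a completed job's parse_results section corresponds to one parsed source file, and each of those generates a trigger event. The sketch below illustrates that relationship; the manifest shape and the helper name are simplified assumptions for this example, not the exact Results Manifest schema.

```python
def source_files_parsed(manifest):
    """Yield the name of each source file recorded in a results manifest's
    parse_results section; each one would generate a parse trigger event.

    The manifest layout here is a simplified assumption for illustration.
    """
    for entry in manifest.get("parse_results", []):
        yield entry["filename"]

manifest = {
    "version": "1.1",
    "parse_results": [
        {"filename": "my_file.txt", "data_types": ["foo", "bar"]},
    ],
}
for name in source_files_parsed(manifest):
    print(name)  # my_file.txt
```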
Example parse trigger configuration:
{
"version": "1.0",
"condition": {
"media_type": "text/plain",
"data_types": [
"foo",
"bar"
]
},
"data": {
"input_data_name": "my_file",
"workspace_name": "my_workspace"
}
}
The condition field is used to define the conditions for when the parse rule is triggered. The media_type field says that a parsed file must have a media type of text/plain (a plain text file) in order to trigger this rule. The data_types field specifies that the parsed file must also have the data types “foo” and “bar” tagged on it in order to trigger the rule. The data field specifies the information needed to create the applicable job/recipe (whatever the trigger rule is linked to) when the rule is triggered. The input_data_name field defines the input parameter name of the job/recipe that the parsed file should be passed to, and the workspace_name field gives the unique system name of the workspace for storing all of the products generated by the created job/recipe. To see all of the options for a parse trigger rule’s configuration, please refer to the Parse Trigger Configuration Specification below.
A valid parse trigger rule configuration is a JSON document with the following structure:
{
"version": "1.0",
"condition": {
"media_type": STRING,
"data_types": [
STRING,
STRING
]
},
"data": {
"input_data_name": STRING,
"workspace_name": STRING
}
}
version
Type: String
Required: No
Defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value, if not included, is the latest version (currently 1.0). It is recommended, though not required, that you include the version so that future changes to the specification will still accept your parse trigger rule configuration.
condition
Type: JSON object
Required: No
Contains other fields that specify the conditions under which this parse rule is triggered. If not provided, the rule is triggered by EVERY source file parse.
media_type
Type: String
Required: No
Defines a media type. A parsed file must have the identical media type defined here in order to trigger this rule. If not provided, the field defaults to "" and all file media types are accepted by the rule.
data_types
Type: Array
Required: No
A list of data type strings. A parsed file must have all of the data types that are listed here tagged to the file in order to trigger this rule. If not provided, the field defaults to [] and no data types are required.
data
Type: JSON object
Required: Yes
Contains other fields that specify the details for creating the job/recipe linked to this trigger rule.
input_data_name
Type: String
Required: Yes
Specifies the input parameter name of the triggered job/recipe that the parsed file should be passed to when the job/recipe is created and placed on the queue.
workspace_name
Type: String
Required: Yes
Contains the unique system name of the workspace that should store the products created by the triggered job/recipe.
Clock triggers are triggers that can occur on a pre-defined schedule. This happens when the Scale Clock process fires every minute and looks at what clock trigger rules are due to be executed. A trigger event is generated for every clock tick that exceeds the threshold specified by a clock trigger rule. Each clock rule uses its own custom trigger event that is defined by the specification outlined below. Clock rules are useful for general system maintenance that cannot be associated with a normal event like file parsing. Calculating system metrics/performance or archiving old records are good cases for a clock rule.
Example clock trigger configuration:
{
"version": "1.0",
"event_type": "MY_METRICS",
"schedule": "PT1H0M0S"
}
The event_type field determines the type of event that is triggered and is used when determining the last time an event was triggered for the rule. The schedule field determines how often the event should be triggered. The schedule value uses the ISO-8601 period format and is interpreted as absolute time within each day. Therefore, in the example above we are specifying that the trigger should happen every hour on the hour. If an event is triggered a few minutes after the hour, the next event will still attempt to fire at the top of the next hour, rather than exactly one hour after the previous event in relative time. This makes the system more predictable and avoids events slowly drifting over time.
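The absolute-time interpretation described above can be sketched by stepping through the day's ticks from midnight until one passes the current time. This is a hypothetical illustration of the documented behavior, not Scale's actual clock code, and it only handles fully specified hour/minute/second periods such as PT1H0M0S.

```python
import re
from datetime import datetime, timedelta

def next_trigger(schedule, now):
    """Compute the next trigger time for a clock rule schedule, treating
    the ISO-8601 period as absolute time within each day.

    Assumes a positive period written in the PT<h>H<m>M<s>S form.
    """
    h, m, s = (int(x) for x in
               re.fullmatch(r"PT(\d+)H(\d+)M(\d+)S", schedule).groups())
    period = timedelta(hours=h, minutes=m, seconds=s)
    # Ticks are anchored to the start of the day, not to the last event.
    tick = now.replace(hour=0, minute=0, second=0, microsecond=0)
    while tick <= now:
        tick += period
    return tick

# The 11:07 AM example from the text: the next hourly trigger is 12:00 PM.
print(next_trigger("PT1H0M0S", datetime(2016, 5, 4, 11, 7)))
# 2016-05-04 12:00:00
```

Anchoring ticks to midnight is what prevents drift: a late execution changes nothing about when the next tick is due.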
Also note that the name field of the trigger rule model must match a corresponding clock event processor registration in the clock module. The processor registration determines what function the Scale clock will execute when the rule is due to trigger a new event.
A valid clock trigger rule configuration is a JSON document with the following structure:
{
"version": "1.0",
"event_type": STRING,
"schedule": STRING
}
version
Type: String
Required: No
Defines the version of the configuration used. This allows updates to be made to the specification while maintaining backwards compatibility by allowing Scale to recognize an older version and convert it to the current version. The default value, if not included, is the latest version (currently 1.0). It is recommended, though not required, that you include the version so that future changes to the specification will still accept your clock trigger rule configuration.
event_type
Type: String
Required: Yes
Determines the trigger event associated with the rule. When the clock process checks to see if a rule needs to be triggered, it will query for associated events using this type. If the clock determines that the rule does in fact need to trigger, then this type is used to create the new event that is passed to the clock processor function to do the actual work.
schedule
Type: String
Required: Yes
Limitation: The current Scale clock implementation does not support the optional days portion of the ISO-8601 period standard, and the smallest time slice that it can execute is once every minute.
Specifies how often the rule should be triggered. The value must follow the ISO-8601 period format, which takes the form of hours, minutes, and seconds to trigger an event. It is important to note that the scheduler interprets the period relative to the start of each day, rather than relative to its last triggered event. That way, if a schedule is defined for every hour and one of the executions falls behind by a few minutes, the next event will still attempt to trigger as close to the hour as possible. For example, if we request execution every hour using PT1H0M0S and the last event actually runs at 11:07 AM, then the next execution will be attempted at 12:00 PM even though that is not a full hour later.