Feat: Add parameters to pre_scan
@tim.schoof The metadata stream brings parameters to the worker function pre_scan. In this MR, these parameters can be described in a nested class Parameters. If this class exists, the parameters are compared with the information coming from the metadata stream.
- If Parameters is not described in the worker class, nothing changes compared to the previous behaviour.
- If parameters are described, they are compared with the information from the metadata stream.
- If some of the described parameters are missing, an error is raised.
- If the type of a described parameter is standard and does not match the type from the metadata stream, an error is raised.
- If the type of a described parameter is not standard (cannot be derived from JSON), the variable from the metadata stream is cast to this type.
- If the metadata contains variables that are not described in the worker's Parameters, no error is raised. These variables may be used by the next worker in the pipeline.
Only some standard classes can be deserialized from the JSON of the metadata stream.
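The rules above can be illustrated with a small, self-contained sketch. It is not the code from this MR: it uses the standard-library `dataclasses` module as a stand-in for the project's Configurable/attrs machinery, and the names `meta_to_parameters`, `JSON_TYPES`, and `ExampleWorker` are hypothetical.

```python
import dataclasses
from pathlib import Path

# Types that json.loads can produce directly (assumed meaning of "standard").
JSON_TYPES = (dict, list, str, int, float, bool)


def meta_to_parameters(parameter_class, metadata):
    """Hypothetical helper: build keyword arguments for pre_scan from metadata."""
    parameters = {}
    for field in dataclasses.fields(parameter_class):
        if field.name not in metadata:
            raise KeyError(f"Parameter {field.name!r} is missing in the metadata stream")
        value = metadata[field.name]
        if field.type in JSON_TYPES:
            # Standard type: it must match exactly, no silent conversion.
            if type(value) is not field.type:
                raise TypeError(
                    f"Parameter {field.name!r} has type {type(value).__name__}, "
                    f"expected {field.type.__name__}"
                )
        else:
            # Non-standard type: cast the JSON value to the declared type.
            value = field.type(value)
        parameters[field.name] = value
    # Metadata keys that are not described are ignored on purpose;
    # a later worker in the pipeline may use them.
    return parameters


class ExampleWorker:
    @dataclasses.dataclass
    class Parameters:
        exposure: float
        output: Path


metadata = {"exposure": 0.5, "output": "scan_001.h5", "unused_here": 42}
params = meta_to_parameters(ExampleWorker.Parameters, metadata)
print(params)
```

In this sketch the extra key `unused_here` is dropped silently, while a missing `exposure` or a string where a float is declared would raise.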
             d = d[k]
         d[keys[-1]] = value
         return options


    +def attrs_to_dict(attrs):
    +    attrs_dict = {}
    +    for attribute in attrs:
    +        config_entry = attribute.metadata[CONFIGURABLE_CONFIG_ENTRY]
    +        attrs_dict[attribute.name] = {
    +            'type': attribute.type,
    +            'help': config_entry.description
    +        }
    +    return attrs_dict

I guess the purpose of this conversion to dict is to have unified access to the fields of the `Attribute` itself and the `_ConfigEntry` extension saved in the metadata field. However, this manual approach of copying single fields doesn't scale very well, and a dict is not a nice interface for an object with fixed fields/keys. Better approaches would be to write

- a function `get_config_entry_option(attribute, name)` that first tries to access `attribute.metadata[CONFIGURABLE_CONFIG_ENTRY].name` and, if that fails, falls back to `attribute.name`
- a wrapper class which does the same, e.g., `config_entry = ConfigWrapper(attribute); print(config_entry.type, config_entry.help)`
- a wrapper class which makes a clearer separation, e.g., `config_wrapper = ConfigWrapper(attribute); print(config_wrapper.attribute.type, config_wrapper.config_entry.help)`
changed this line in version 3 of the diff
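As a rough illustration of the wrapper idea from this thread, the sketch below combines both variants: forwarded access (`wrapper.type`) and explicit access (`wrapper.attribute.type`). It is only an assumption of how such a `ConfigWrapper` could look. The attribute and config entry are faked with `SimpleNamespace`, and the value of `CONFIGURABLE_CONFIG_ENTRY` is a placeholder, since the real constant is not shown here.

```python
from types import SimpleNamespace

# Placeholder value; the real constant lives in the project's configurable module.
CONFIGURABLE_CONFIG_ENTRY = "config_entry"


class ConfigWrapper:
    """Unified read access: config entry first, then the attribute itself."""

    def __init__(self, attribute):
        self.attribute = attribute
        self.config_entry = attribute.metadata[CONFIGURABLE_CONFIG_ENTRY]

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for forwarded names.
        if name in ("attribute", "config_entry"):
            # Guard against infinite recursion on a half-initialised instance.
            raise AttributeError(name)
        try:
            return getattr(self.config_entry, name)
        except AttributeError:
            return getattr(self.attribute, name)


# Stand-ins for an attrs Attribute and its config entry (hypothetical data).
entry = SimpleNamespace(description="Exposure time in seconds")
attribute = SimpleNamespace(
    name="exposure", type=float, metadata={CONFIGURABLE_CONFIG_ENTRY: entry}
)

wrapper = ConfigWrapper(attribute)
print(wrapper.type, wrapper.description)                  # forwarded access
print(wrapper.attribute.name, wrapper.config_entry.description)  # explicit access
```

Forwarding keeps call sites short, while the explicit form makes it obvious which object a field comes from; which trade-off fits the project better is a judgment call.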
                 return

             log.info("Performing pre-scan setup")
    -        self.worker.pre_scan(data, substream_metadata)
    +        parameters = self._meta_to_parameters(substream_metadata['meta'])
    +        self.worker.pre_scan(data, substream_metadata, **parameters)

changed this line in version 3 of the diff
         finally:
             self._shutdown()

    +    def _meta_to_parameters(self, metadata):
    +        parameters = {}
    +        if hasattr(self.worker, "Parameters"):
    +            parameters = attr.asdict(create_instance_from_configurable(self.worker.Parameters, metadata))
    +
    +            # check parameters type

This should be a function `check_types`, probably in the configurable module. An alternative approach would be to check the types either with the help of https://www.attrs.org/en/stable/api.html#attr.validators.instance_of or in `create_instance_from_configurable`. This would stop construction as soon as the type error occurs and not execute code that expects the correct type in `__attrs_post_init__`. I am not sure it is worth the effort at the moment.

changed this line in version 3 of the diff
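For reference, a minimal sketch of the validator-based alternative mentioned in this thread. It uses plain `attr.ib` from the attrs library; whether the project's `Config` helper forwards a `validator` argument is not shown in this MR, so that part is an assumption, and the class and field names are made up.

```python
import attr


@attr.s
class Parameters:
    # instance_of raises TypeError inside __init__, before __attrs_post_init__ runs.
    exposure = attr.ib(validator=attr.validators.instance_of(float))
    n_frames = attr.ib(validator=attr.validators.instance_of(int))

    def __attrs_post_init__(self):
        # Both attributes are guaranteed to have the declared types here.
        self.total_time = self.exposure * self.n_frames


ok = Parameters(0.5, 10)
print(ok.total_time)

try:
    Parameters("0.5", 10)  # wrong type for exposure
except TypeError as error:
    print("rejected:", type(error).__name__)
```

Because validation happens during construction, no separate type-checking pass is needed after the instance has been created.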
         return flattened_entries


    +def get_config_entry_option(attribute, name):
    +    value = None
    +    if hasattr(attribute.metadata[CONFIGURABLE_CONFIG_ENTRY], name):
    +        value = getattr(attribute.metadata[CONFIGURABLE_CONFIG_ENTRY], name)
    +    else:
    +        value = getattr(attribute, name)
    +    return value
    +
    +
    +def check_type(parameter_class, parameters):

changed this line in version 6 of the diff
- Resolved by Mikhail Karnevskiy
    +def get_config_entry_option(attribute, name):
    +    value = None
    +    if hasattr(attribute.metadata[CONFIGURABLE_CONFIG_ENTRY], name):
    +        value = getattr(attribute.metadata[CONFIGURABLE_CONFIG_ENTRY], name)
    +    else:
    +        value = getattr(attribute, name)
    +    return value
    +
    +
    +def check_type(parameter_class, parameters):
    +    for attribute in attr.fields(parameter_class):
    +        expected_type = get_config_entry_option(attribute, 'type')
    +        variable = getattr(parameters, attribute.name)
    +        if isinstance(variable, (list, str, dict, float, bool, int)):

This follows the table of JSON conversion: https://docs.python.org/3/library/json.html#json-to-py-table

But probably we want to support more, e.g., the `Path` type for filenames:

    In [15]: @Configurable
        ...: class Test:
        ...:     file = Config("A file", type=Path, converter=Path)
        ...:

    In [16]: Test("foo.txt")
    Out[16]: Test(file=PosixPath('foo.txt'))

Here, I would keep things general and only check if the actual type of each attribute is a subtype of the declared type.
changed this line in version 6 of the diff
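The general check suggested in this thread could look roughly like the sketch below. It is an illustration only: it uses standard-library `dataclasses` instead of the project's `Configurable`/`Config` decorators, and `check_types` and `ScanParameters` are hypothetical names.

```python
import dataclasses
from pathlib import Path


def check_types(parameters):
    """Raise TypeError unless every field value is an instance of its declared type."""
    for field in dataclasses.fields(parameters):
        value = getattr(parameters, field.name)
        # isinstance accepts subtypes, e.g. PosixPath for a field declared as Path.
        if not isinstance(value, field.type):
            raise TypeError(
                f"Field {field.name!r} has type {type(value).__name__}, "
                f"expected {field.type.__name__}"
            )


@dataclasses.dataclass
class ScanParameters:
    file: Path
    repeats: int = 1


check_types(ScanParameters(Path("foo.txt")))  # passes: concrete Path subclass

try:
    check_types(ScanParameters("foo.txt"))  # a plain str is not a Path
except TypeError as error:
    print("rejected:", error)
```

One consequence of a pure `isinstance` check is that `bool` values pass for fields declared as `int`, because `bool` is a subclass of `int` in Python. Whether that is acceptable depends on how strict the metadata contract should be.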
    +def get_config_entry_option(attribute, name):
    +    value = None
    +    if hasattr(attribute.metadata[CONFIGURABLE_CONFIG_ENTRY], name):
    +        value = getattr(attribute.metadata[CONFIGURABLE_CONFIG_ENTRY], name)
    +    else:
    +        value = getattr(attribute, name)
    +    return value
    +
    +
    +def check_type(parameter_class, parameters):
    +    for attribute in attr.fields(parameter_class):
    +        expected_type = get_config_entry_option(attribute, 'type')
    +        variable = getattr(parameters, attribute.name)
    +        if isinstance(variable, (list, str, dict, float, bool, int)):
    +            if expected_type != type(variable):
    +                raise TypeError(f"Variable {attribute.name} in metadata stream have wrong type. "

changed this line in version 5 of the diff
- Resolved by Tim Schoof
             d = d[k]
         d[keys[-1]] = value
         return options


    +def attrs_to_dict(attrs):

changed this line in version 7 of the diff