Configurations
Melusine components can be instantiated from parameters defined in configurations. The from_config method accepts a config_dict argument:
from melusine.processors import Normalizer
normalizer_conf = {
    "input_columns": ["text"],
    "output_columns": ["normalized_text"],
    "form": "NFKD",
    "lowercase": False,
}
normalizer = Normalizer.from_config(config_dict=normalizer_conf)
Alternatively, from_config accepts a config_key argument:
from melusine.pipeline import MelusinePipeline
pipeline = MelusinePipeline.from_config(config_key="demo_pipeline")
When demo_pipeline is given as the config_key argument, the parameters are read from the melusine.config object at key demo_pipeline.
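To make the two entry points concrete, here is a minimal sketch of how a from_config class method could resolve its parameters. It is not Melusine's implementation: ConfigurableSketch and GLOBAL_CONFIG are hypothetical stand-ins for a Melusine component and for the melusine.config object, and the "exactly one argument" rule is an assumption of this sketch.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in for the global melusine.config object
GLOBAL_CONFIG: Dict[str, Dict[str, Any]] = {
    "demo_normalizer": {"form": "NFKD", "lowercase": False},
}


class ConfigurableSketch:
    """Illustrative class, not part of Melusine."""

    def __init__(self, form: str = "NFKD", lowercase: bool = True) -> None:
        self.form = form
        self.lowercase = lowercase

    @classmethod
    def from_config(
        cls,
        config_key: Optional[str] = None,
        config_dict: Optional[Dict[str, Any]] = None,
    ) -> "ConfigurableSketch":
        # Assumption of this sketch: exactly one source of parameters is given
        if (config_key is None) == (config_dict is None):
            raise ValueError("Provide exactly one of config_key or config_dict")
        if config_key is not None:
            # Look the parameters up in the global configuration
            config_dict = GLOBAL_CONFIG[config_key]
        return cls(**config_dict)


# Parameters read from the global configuration
from_key = ConfigurableSketch.from_config(config_key="demo_normalizer")
# Parameters passed explicitly
from_dict = ConfigurableSketch.from_config(config_dict={"form": "NFC", "lowercase": True})
```

In this sketch both calls end up in the same constructor; the only difference is where the parameter dict comes from.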
Access Configurations
The Melusine configurations can be accessed through the config object (from melusine import config).
The entry stored at key demo_pipeline, for example, looks like this:
{
    'steps': [
        {'class_name': 'Cleaner', 'config_key': 'body_cleaner', 'module': 'melusine.processors'},
        {'class_name': 'Cleaner', 'config_key': 'header_cleaner', 'module': 'melusine.processors'},
        {'class_name': 'Segmenter', 'config_key': 'segmenter', 'module': 'melusine.processors'},
        {'class_name': 'ContentTagger', 'config_key': 'content_tagger', 'module': 'melusine.processors'},
        {'class_name': 'TextExtractor', 'config_key': 'text_extractor', 'module': 'melusine.processors'},
        {'class_name': 'Normalizer', 'config_key': 'demo_normalizer', 'module': 'melusine.processors'},
        {'class_name': 'EmergencyDetector', 'config_key': 'emergency_detector', 'module': 'melusine.detectors'}
    ]
}
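Each step above names a class through a module and class_name pair. One plausible way to turn such a pair into a Python class is a dynamic import, sketched below. The resolve_class helper is hypothetical and only illustrates the idea; it is not necessarily how MelusinePipeline resolves its steps. Standard-library classes stand in for the Melusine ones so the sketch runs without Melusine installed.

```python
import importlib
from typing import Any, Dict


def resolve_class(step: Dict[str, Any]) -> type:
    """Import step["module"] and return the attribute named step["class_name"]."""
    module = importlib.import_module(step["module"])
    return getattr(module, step["class_name"])


# Standard-library stand-ins for the Melusine steps
steps = [
    {"class_name": "OrderedDict", "module": "collections"},
    {"class_name": "Fraction", "module": "fractions"},
]
resolved = [resolve_class(step) for step in steps]
print([cls.__name__ for cls in resolved])  # ['OrderedDict', 'Fraction']
```

Once a class is resolved, the step's config_key could be passed on to that class's from_config method.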
Modify Configurations
The simplest way to modify configurations is to build a new dict and reset the configurations with it.
from melusine import config
# Get a dict of the existing conf
new_conf = config.dict()
# Add/Modify a config key
new_conf["my_conf_key"] = "my_conf_value"
# Reset Melusine configurations
config.reset(new_conf)
For code delivered to a production environment, configuration files should be preferred over modifying the configurations on the fly.
Melusine lets you specify the path to a folder containing YAML files and loads them (the OmegaConf package is used behind the scenes).
from melusine import config
# Specify the path to a conf folder
conf_path = "path/to/conf/folder"
# Reset Melusine configurations
config.reset(config_path=conf_path)
# >> Using config_path : path/to/conf/folder
When the MELUSINE_CONFIG_DIR environment variable is set, Melusine loads the configuration files directly from the path specified by the environment variable.
import os
from melusine import config
# Specify the MELUSINE_CONFIG_DIR environment variable
os.environ["MELUSINE_CONFIG_DIR"] = "path/to/conf/folder"
# Reset Melusine configurations
config.reset()
# >> Using config_path from env variable MELUSINE_CONFIG_DIR
# >> Using config_path : path/to/conf/folder
Tip
If MELUSINE_CONFIG_DIR is set before melusine is imported (e.g., before starting the program), you don't need to call config.reset().
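The two ways of pointing Melusine at a configuration folder can be summarized with a small lookup sketch. The resolve_config_path helper is hypothetical, and the precedence it encodes (explicit argument first, then the environment variable, then None to mean the built-in defaults) is an assumption; the text above does not say what happens when both are provided.

```python
import os
from typing import Optional

ENV_VAR = "MELUSINE_CONFIG_DIR"


def resolve_config_path(config_path: Optional[str] = None) -> Optional[str]:
    """Pick the configuration folder: explicit argument, else environment variable, else None."""
    if config_path is not None:
        # An explicit path wins (assumed precedence)
        return config_path
    # Fall back to the environment variable; None means "use the defaults"
    return os.environ.get(ENV_VAR)
```

Calling resolve_config_path() with the variable set returns its value, while resolve_config_path("other/folder") ignores the variable.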
Export Configurations
Creating a configuration folder from scratch would be cumbersome. Instead, export the default configurations and modify only the files you need.
from melusine import config
# Specify the path to a folder (created if it doesn't exist)
conf_path = "path/to/conf/folder"
# Export default configurations to the folder
files_created = config.export_default_config(path=conf_path)
Tip
The export_default_config method returns a list of paths to all the files created.