Configurations
Melusine components can be instantiated using parameters defined in configurations. The from_config method accepts a config_dict argument.
from melusine.processors import Normalizer
normalizer_conf = {
"input_columns": ["text"],
"output_columns": ["normalized_text"],
"form": "NFKD",
"lowercase": False,
}
normalizer = Normalizer.from_config(config_dict=normalizer_conf)
Or a config_key argument.
from melusine.pipeline import MelusinePipeline
pipeline = MelusinePipeline.from_config(config_key="demo_pipeline")
When demo_pipeline is given as the config_key argument, parameters are read from the melusine.config object at key demo_pipeline.
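The two ways of calling from_config can be pictured with a minimal sketch. This is not Melusine's implementation: SketchComponent and REGISTRY are hypothetical names, REGISTRY stands in for the melusine.config object, and the rule that exactly one argument must be given is an assumption made for the sketch.

```python
from typing import Any, Dict, Optional

# Hypothetical stand-in for the melusine.config object: a plain dict registry.
REGISTRY: Dict[str, Dict[str, Any]] = {
    "demo_normalizer": {"form": "NFKD", "lowercase": False},
}


class SketchComponent:
    """Illustrative component, not a Melusine class."""

    def __init__(self, form: str, lowercase: bool) -> None:
        self.form = form
        self.lowercase = lowercase

    @classmethod
    def from_config(
        cls,
        config_key: Optional[str] = None,
        config_dict: Optional[Dict[str, Any]] = None,
    ) -> "SketchComponent":
        # Assumption for this sketch: exactly one of the two arguments is given.
        if (config_key is None) == (config_dict is None):
            raise ValueError("Provide either config_key or config_dict")
        if config_key is not None:
            # Read the parameters from the shared registry at the given key.
            config_dict = REGISTRY[config_key]
        return cls(**config_dict)


# Parameters looked up by key in the registry
by_key = SketchComponent.from_config(config_key="demo_normalizer")
# Parameters passed explicitly
by_dict = SketchComponent.from_config(config_dict={"form": "NFC", "lowercase": True})
```

Passing a key keeps parameters in one shared place, while passing a dict keeps them next to the code that uses them.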
Access Configurations
The Melusine configurations can be accessed with the config object (from melusine import config).
The configuration of the demo_pipeline can then be inspected at key demo_pipeline, for example by printing config["demo_pipeline"].
{
'steps': [
{'class_name': 'Cleaner', 'config_key': 'body_cleaner', 'module': 'melusine.processors'},
{'class_name': 'Cleaner', 'config_key': 'header_cleaner', 'module': 'melusine.processors'},
{'class_name': 'Segmenter', 'config_key': 'segmenter', 'module': 'melusine.processors'},
{'class_name': 'ContentTagger', 'config_key': 'content_tagger', 'module': 'melusine.processors'},
{'class_name': 'TextExtractor', 'config_key': 'text_extractor', 'module': 'melusine.processors'},
{'class_name': 'Normalizer', 'config_key': 'demo_normalizer', 'module': 'melusine.processors'},
{'class_name': 'EmergencyDetector', 'config_key': 'emergency_detector', 'module': 'melusine.detectors'}
]
}
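Each step in the output above names a class, the module it lives in, and the config key holding its own parameters. As a rough illustration, the sketch below copies three of those steps into a plain dict and summarizes them. The summarize_steps helper is hypothetical and not part of Melusine.

```python
# Plain-dict copy of three of the steps printed above.
demo_pipeline_conf = {
    "steps": [
        {"class_name": "Cleaner", "config_key": "body_cleaner", "module": "melusine.processors"},
        {"class_name": "Normalizer", "config_key": "demo_normalizer", "module": "melusine.processors"},
        {"class_name": "EmergencyDetector", "config_key": "emergency_detector", "module": "melusine.detectors"},
    ]
}


def summarize_steps(pipeline_conf):
    """Return one 'module.ClassName <- config_key' string per pipeline step."""
    return [
        f"{step['module']}.{step['class_name']} <- {step['config_key']}"
        for step in pipeline_conf["steps"]
    ]


for line in summarize_steps(demo_pipeline_conf):
    print(line)
```

A summary like this makes it easier to see which config key to look up when a given step needs different parameters.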
Modify Configurations
The simplest way to modify configurations is to create a new dictionary directly.
from melusine import config
# Get a dict of the existing conf
new_conf = config.dict()
# Add/Modify a config key
new_conf["my_conf_key"] = "my_conf_value"
# Reset Melusine configurations
config.reset(new_conf)
To deliver code in a production environment, using configuration files should be preferred to modifying the configurations on the fly.
Melusine lets you specify the path to a folder containing YAML files and loads them (the OmegaConf package is used behind the scenes).
from melusine import config
# Specify the path to a conf folder
conf_path = "path/to/conf/folder"
# Reset Melusine configurations
config.reset(config_path=conf_path)
# >> Using config_path : path/to/conf/folder
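To make the folder-based approach more concrete, here is a minimal sketch that writes one YAML file into a configuration folder using only the standard library. The file name my_normalizer.yaml and the key my_normalizer are hypothetical, and the sketch assumes that each top-level YAML key becomes a key of the Melusine configuration. Check that against the exported default files before relying on it.

```python
import tempfile
from pathlib import Path

# Assumption: each top-level YAML key becomes a key of the Melusine config.
# The parameters mirror the normalizer_conf dict shown earlier.
NORMALIZER_YAML = """\
my_normalizer:
  input_columns:
    - text
  output_columns:
    - normalized_text
  form: NFKD
  lowercase: false
"""

# Create a throwaway conf folder (a real project would use a versioned folder).
conf_dir = Path(tempfile.mkdtemp()) / "conf"
conf_dir.mkdir(parents=True, exist_ok=True)

# Hypothetical file name; one file per component keeps changes easy to review.
yaml_file = conf_dir / "my_normalizer.yaml"
yaml_file.write_text(NORMALIZER_YAML, encoding="utf-8")

print(sorted(p.name for p in conf_dir.glob("*.yaml")))
```

The path to such a folder is what would be passed as config_path to config.reset.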
When the MELUSINE_CONFIG_DIR environment variable is set, Melusine directly loads the configuration files located at the path specified by the environment variable.
import os
from melusine import config
# Specify the MELUSINE_CONFIG_DIR environment variable
os.environ["MELUSINE_CONFIG_DIR"] = "path/to/conf/folder"
# Reset Melusine configurations
config.reset()
# >> Using config_path from env variable MELUSINE_CONFIG_DIR
# >> Using config_path : path/to/conf/folder
Tip
If the MELUSINE_CONFIG_DIR environment variable is set before melusine is imported (e.g., before starting the program), you don't need to call config.reset().
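Because the variable is read when Melusine loads its configurations, a program may want to check it up front. The helper below is a hypothetical pre-flight check and not part of Melusine. It only reports which source would apply, assuming the default configurations are used when the variable is unset.

```python
import os
from typing import Mapping

ENV_VAR = "MELUSINE_CONFIG_DIR"


def describe_config_source(environ: Mapping[str, str] = os.environ) -> str:
    """Report whether a custom conf folder is declared in the environment."""
    conf_dir = environ.get(ENV_VAR)
    if conf_dir:
        return f"custom configurations from {conf_dir}"
    # Assumption: without the variable, the default configurations apply.
    return "default configurations"


# Run this before importing melusine to log which configurations to expect.
print(describe_config_source())
```

Taking the environment as a parameter keeps the helper easy to test without changing the real process environment.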
Export Configurations
Creating your configuration folder from scratch would be cumbersome. It is advised to export the default configurations and then modify just the files you need.
from melusine import config
# Specify the path to a folder (created if it doesn't exist)
conf_path = "path/to/conf/folder"
# Export default configurations to the folder
files_created = config.export_default_config(path=conf_path)
Tip
The export_default_config method returns a list of paths to all the files created.