smallpond.logical.node.DataSinkNode
- class smallpond.logical.node.DataSinkNode(ctx: Context, input_deps: Tuple[Node, ...], output_path: str, type: Literal['link', 'copy', 'link_or_copy', 'manifest'] = 'link', manifest_only=False, is_final_node=False)
Collect the output files of input_deps into output_path. Depending on the options, it may create hard links, symbolic links, or manifest files, or copy the files.
- __init__(ctx: Context, input_deps: Tuple[Node, ...], output_path: str, type: Literal['link', 'copy', 'link_or_copy', 'manifest'] = 'link', manifest_only=False, is_final_node=False) → None
Construct a DataSinkNode. See Node.__init__() for descriptions of the other parameters.
Parameters
- output_path
The absolute path of a custom output folder. If set to None, an output folder is created under the default output root. Any shared folder that both the executor and the scheduler can access is allowed, although I/O performance varies across filesystems.
- type, optional
The operation type of the sink node:
  - "link" (default): If an output file is under the same mount point as output_path, a hard link is created; otherwise a symbolic link is created.
  - "copy": Files are copied to output_path.
  - "link_or_copy": If an output file is under the same mount point as output_path, a hard link is created; otherwise the file is copied.
  - "manifest": A manifest file is created under output_path. Every line of the manifest file is a path string.
- manifest_only, optional, deprecated
Deprecated. Sets type to "manifest".
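The four type values described above can be sketched with the Python standard library alone. This is a rough illustration, not smallpond's implementation: the helper names collect_outputs and same_mount, the manifest file name manifest.txt, and the use of st_dev to approximate "same mount point" are all assumptions made for the example.

```python
import os
import shutil
from typing import Iterable, List

VALID_TYPES = ("link", "copy", "link_or_copy", "manifest")


def same_mount(path_a: str, path_b: str) -> bool:
    """Approximate 'same mount point' by comparing device ids (assumption)."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def collect_outputs(files: Iterable[str], output_path: str, type: str = "link") -> List[str]:
    """Hypothetical helper mimicking the documented sink types.

    Returns the list of paths created under output_path.
    """
    if type not in VALID_TYPES:
        raise ValueError(f"unknown sink type: {type!r}")
    os.makedirs(output_path, exist_ok=True)
    sources = [os.path.abspath(f) for f in files]

    if type == "manifest":
        # One path string per line; the file name is an assumption.
        manifest = os.path.join(output_path, "manifest.txt")
        with open(manifest, "w") as fout:
            fout.writelines(f"{path}\n" for path in sources)
        return [manifest]

    created = []
    for src in sources:
        dst = os.path.join(output_path, os.path.basename(src))
        if type == "copy":
            shutil.copyfile(src, dst)
        elif same_mount(src, output_path):
            # "link" and "link_or_copy" both hard-link on the same mount.
            os.link(src, dst)
        elif type == "link":
            # Different mount: fall back to a symbolic link.
            os.symlink(src, dst)
        else:
            # "link_or_copy" on a different mount: fall back to a copy.
            shutil.copyfile(src, dst)
        created.append(dst)
    return created
```

For example, collect_outputs(["/data/part-0.parquet"], "/shared/out", type="link_or_copy") would hard-link the file when both paths share a device and copy it otherwise; the paths here are placeholders.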
Methods
- __init__(ctx, input_deps, output_path[, ...]): Construct a DataSinkNode.
- add_perf_metrics(name, value)
- create_task(*args, **kwargs)
- get_perf_stats(name)
- slim_copy()
- task_factory(task_builder)
Attributes
- enable_resource_boost
- num_partitions