FILTER


Filter a TTree using RDataFrame.

URL: https://root.cern/doc/master/classROOT_1_1RDataFrame.html

Example

This wrapper can be used in the following way:

rule filter_str:
    input:
        "ntuple0.root",
    output:
        "ntuple0_str_output.root",
    log:
        "logs/filter/filter_str.log",
    params:
        input_tree_name="TestTree",
        output_tree_name="TestTree",
        criteria="(pt > 1400) && (pz > 19000)",
        branches_to_save=["pz", "pt", "p"],
    threads: 2
    wrapper:
        "v9.4.0/phys/root/filter"

rule filter_list:
    input:
        "ntuple0.root",
    output:
        "ntuple0_list_output.root",
    log:
        "logs/filter/filter_list.log",
    params:
        input_tree_name="TestTree",
        output_tree_name="TestTree",
        criteria=["pt > 1400", "pz > 19000"],
        branches_to_save=["pz", "pt", "p"],
        verbose=True
    threads: 2
    wrapper:
        "v9.4.0/phys/root/filter"

rule filter_dict:
    input:
        "ntuple0.root",
    output:
        "ntuple0_dict_output.root",
    log:
        "logs/filter/filter.log",
    params:
        input_tree_name="TestTree",
        output_tree_name="TestTree",
        criteria={
            "PT cut": "pt > 1400",
            "PZ cut": "pz > 19000"
        },
        branches_to_save=["pz", "pt", "p"],
        verbose=True
    threads: 1
    wrapper:
        "v9.4.0/phys/root/filter"

Note that input, output and log file paths can be chosen freely.
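The three rules above differ only in how criteria is written: one combined expression, a list of expressions, or a dict of labelled expressions. The following is a minimal pure-Python sketch of why the selected entries should be the same in each case. It does not use ROOT; the toy events and the lambda cuts are hypothetical stand-ins for TTree entries and the C++ filter expressions.

```python
# Toy events standing in for TTree entries (values are made up for illustration).
events = [
    {"pt": 1500.0, "pz": 20000.0},
    {"pt": 1500.0, "pz": 18000.0},
    {"pt": 1000.0, "pz": 25000.0},
    {"pt": 2000.0, "pz": 19500.0},
]

# String form: one combined expression, as in rule filter_str.
combined = [e for e in events if (e["pt"] > 1400) and (e["pz"] > 19000)]

# List or dict form: one filter per criterion, applied in sequence.
cuts = [lambda e: e["pt"] > 1400, lambda e: e["pz"] > 19000]
chained = events
for cut in cuts:
    chained = [e for e in chained if cut(e)]

print(combined == chained)  # True
print(len(chained))  # 2
```

The practical difference between the forms is in the labels attached to each filter, which only matter when a cut-flow report is requested with verbose=True.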

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • root=6.38.04

Input/Output

Input:

  • ROOT file

Output:

  • ROOT file

Params

  • input_tree_name: name of the input TTree

  • output_tree_name: name of the output TTree

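  • criteria: filtering criteria, given as a single expression string, e.g. "(abs(Dp_MM - 1968.34) < 50)", a list of expression strings, or a dict mapping filter labels to expression strings. Multiple criteria are applied in sequence, so an entry must pass all of them.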

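  • branches_to_save: list of branch names to be saved. If not specified, all branches of the input TTree will be saved. (optional)

  • verbose: if True, print a cut-flow report for the applied filters once the output has been written. Defaults to False. (optional)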
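As a rough illustration of how the criteria param is interpreted, the sketch below turns each accepted form into (expression, label) pairs. The function name normalize_criteria is hypothetical and not part of the wrapper; it only mirrors the parsing logic shown in the Code section.

```python
def normalize_criteria(criteria):
    """Return a list of (expression, label) pairs for str, list, or dict input."""
    if isinstance(criteria, str):
        # A single expression is used as its own label.
        return [(criteria, criteria)]
    if isinstance(criteria, list):
        # Each expression in a list is used as its own label.
        return [(expr, expr) for expr in criteria]
    if isinstance(criteria, dict):
        # Dict keys are labels, dict values are expressions.
        return [(expr, label) for label, expr in criteria.items()]
    raise TypeError("criteria must be a str, list, or dict")


as_str = normalize_criteria("(pt > 1400) && (pz > 19000)")
as_list = normalize_criteria(["pt > 1400", "pz > 19000"])
as_dict = normalize_criteria({"PT cut": "pt > 1400", "PZ cut": "pz > 19000"})

print(as_str)   # [('(pt > 1400) && (pz > 19000)', '(pt > 1400) && (pz > 19000)')]
print(as_list)  # [('pt > 1400', 'pt > 1400'), ('pz > 19000', 'pz > 19000')]
print(as_dict)  # [('pt > 1400', 'PT cut'), ('pz > 19000', 'PZ cut')]
```

The dict form is the only one that gives each filter a label distinct from its expression.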

Authors

  • Anfeng Li

Code

__author__ = "Anfeng Li, Jamie Gooding"
__copyright__ = "Copyright 2024, Anfeng Li"
__email__ = "anfeng.li@cern.ch, jamie.gooding@cern.ch"
__license__ = "MIT"

import ROOT

# Match ROOT's implicit multithreading to the threads Snakemake allotted.
if snakemake.threads > 1:
    ROOT.EnableImplicitMT(snakemake.threads)
else:
    ROOT.DisableImplicitMT()

# Parse criteria: accept a single expression, a list of expressions,
# or a dict mapping filter labels to expressions.
_smk_criteria = snakemake.params.criteria
if isinstance(_smk_criteria, str):
    criteria = [_smk_criteria]
    labels = [_smk_criteria]
elif isinstance(_smk_criteria, list):
    criteria = _smk_criteria
    labels = _smk_criteria
elif isinstance(_smk_criteria, dict):
    criteria = list(_smk_criteria.values())
    labels = list(_smk_criteria.keys())
else:
    raise TypeError("Parameter 'criteria' must be of type 'str', 'list' or 'dict'")

# Optional list of branches to keep; None means all branches are written.
branches_to_save = snakemake.params.get("branches_to_save", None)

df = ROOT.RDataFrame(snakemake.params.input_tree_name, snakemake.input[0])
# Chain one named filter per criterion.
for criterion, label in zip(criteria, labels):
    df = df.Filter(criterion, label)

# Book the cut-flow report before the Snapshot: it is lazy and is filled
# by the event loop that the Snapshot triggers.
verbose = snakemake.params.get("verbose", False)
if verbose:
    report = df.Report()

if branches_to_save is not None:
    df.Snapshot(
        snakemake.params.output_tree_name,
        snakemake.output[0],
        branches_to_save,
    )
else:
    df.Snapshot(snakemake.params.output_tree_name, snakemake.output[0])

if verbose:
    report.Print()
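With verbose=True the wrapper prints a cut-flow report for the named filters. The sketch below is a pure-Python illustration of the kind of bookkeeping such a report summarises: how many entries reach each cut, how many pass it, and the per-cut and cumulative efficiencies. It is not ROOT's implementation and does not reproduce ROOT's output format; cut_flow, the toy events, and the lambda cuts are hypothetical.

```python
def cut_flow(events, named_cuts):
    """Apply cuts in order; return (label, n_pass, n_all) for each cut."""
    rows = []
    remaining = list(events)
    for label, cut in named_cuts:
        n_all = len(remaining)  # entries that reach this cut
        remaining = [e for e in remaining if cut(e)]
        rows.append((label, len(remaining), n_all))
    return rows


# Toy events standing in for TTree entries (values are made up for illustration).
events = [
    {"pt": 1500.0, "pz": 20000.0},
    {"pt": 1500.0, "pz": 18000.0},
    {"pt": 1000.0, "pz": 25000.0},
    {"pt": 2000.0, "pz": 19500.0},
]
named_cuts = [
    ("PT cut", lambda e: e["pt"] > 1400),
    ("PZ cut", lambda e: e["pz"] > 19000),
]

total = len(events)
for label, n_pass, n_all in cut_flow(events, named_cuts):
    eff = 100.0 * n_pass / n_all if n_all else 0.0
    cumulative = 100.0 * n_pass / total if total else 0.0
    print(f"{label}: pass={n_pass} all={n_all} eff={eff:.2f}% cumulative={cumulative:.2f}%")
# PT cut: pass=3 all=4 eff=75.00% cumulative=75.00%
# PZ cut: pass=2 all=3 eff=66.67% cumulative=50.00%
```

Because each cut only sees entries that survived the previous ones, the order of the criteria changes the per-cut numbers but not the final selection.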