FILTER
Filter a TTree using RDataFrame.
URL: https://root.cern/doc/master/classROOT_1_1RDataFrame.html
Example
This wrapper can be used in the following way:
rule filter_str:
    input:
        "ntuple0.root",
    output:
        "ntuple0_str_output.root",
    log:
        "logs/filter/filter_str.log",
    params:
        input_tree_name="TestTree",
        output_tree_name="TestTree",
        criteria="(pt > 1400) && (pz > 19000)",
        branches_to_save=["pz", "pt", "p"],
    threads: 2
    wrapper:
        "v7.7.0/phys/root/filter"


rule filter_list:
    input:
        "ntuple0.root",
    output:
        "ntuple0_list_output.root",
    log:
        "logs/filter/filter_list.log",
    params:
        input_tree_name="TestTree",
        output_tree_name="TestTree",
        criteria=["pt > 1400", "pz > 19000"],
        branches_to_save=["pz", "pt", "p"],
        verbose=True,
    threads: 2
    wrapper:
        "v7.7.0/phys/root/filter"


rule filter_dict:
    input:
        "ntuple0.root",
    output:
        "ntuple0_dict_output.root",
    log:
        "logs/filter/filter.log",
    params:
        input_tree_name="TestTree",
        output_tree_name="TestTree",
        criteria={
            "PT cut": "pt > 1400",
            "PZ cut": "pz > 19000",
        },
        branches_to_save=["pz", "pt", "p"],
        verbose=True,
    threads: 1
    wrapper:
        "v7.7.0/phys/root/filter"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Software dependencies
ROOT=6.30.4
Input/Output
Input:
ROOT file
Output:
ROOT file
Params
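input_tree_name: name of the input TTree
output_tree_name: name of the output TTree
criteria: filtering criteria, e.g. “(abs(Dp_MM - 1968.34) < 50)”. May be given as a single string, a list of strings, or a dict mapping a label to an expression; multiple criteria are applied one after another.
branches_to_save: list of branch names to be saved. If not specified, all branches of the input TTree will be saved. (optional)
verbose: if True, print a report of the applied filters once the output has been written. (optional)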
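The three shapes accepted by criteria (string, list, dict) all reduce to a sequence of expressions with one label each. The following is a minimal sketch of that normalisation in plain Python, without ROOT. The helper name normalize_criteria is hypothetical and exists only for this illustration; it is not part of the wrapper.

```python
def normalize_criteria(criteria):
    """Return (expressions, labels) lists for a str, list, or dict of cuts.

    Hypothetical helper for illustration; not part of the wrapper's API.
    """
    if isinstance(criteria, str):
        # A single expression labels itself.
        return [criteria], [criteria]
    if isinstance(criteria, list):
        # Each expression in the list labels itself.
        return list(criteria), list(criteria)
    if isinstance(criteria, dict):
        # Keys are labels, values are the expressions to apply.
        return list(criteria.values()), list(criteria.keys())
    raise TypeError("criteria should be a str, list, or dict")


exprs, labels = normalize_criteria({"PT cut": "pt > 1400", "PZ cut": "pz > 19000"})
print(list(zip(labels, exprs)))
```

With a dict, the keys become the filter labels, which is why the filter_dict rule above yields more readable names than the string and list forms, where each expression serves as its own label.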
Code
__author__ = "Anfeng Li, Jamie Gooding"
__copyright__ = "Copyright 2024, Anfeng Li"
__email__ = "anfeng.li@cern.ch, jamie.gooding@cern.ch"
__license__ = "MIT"

from typing import Dict, List

import ROOT

if snakemake.threads > 1:
    ROOT.EnableImplicitMT(snakemake.threads)
else:
    ROOT.DisableImplicitMT()

# Parse criteria
_smk_criteria = snakemake.params.criteria
if isinstance(_smk_criteria, str):
    criteria = [_smk_criteria]
    labels = [_smk_criteria]
elif isinstance(_smk_criteria, List):
    criteria = _smk_criteria
    labels = _smk_criteria
elif isinstance(_smk_criteria, Dict):
    criteria = _smk_criteria.values()
    labels = _smk_criteria.keys()
else:
    raise TypeError("Parameter 'criteria' should be type of 'str', 'list' or 'dict'")

branches_to_save = snakemake.params.get("branches_to_save", None)

df = ROOT.RDataFrame(snakemake.params.input_tree_name, snakemake.input[0])
for criterion, label in zip(criteria, labels):
    df = df.Filter(criterion, label)

if snakemake.params.get("verbose", False):
    report = df.Report()

if branches_to_save is not None:
    df.Snapshot(
        snakemake.params.output_tree_name,
        snakemake.output[0],
        branches_to_save,
    )
else:
    df.Snapshot(snakemake.params.output_tree_name, snakemake.output[0])

if snakemake.params.get("verbose", False):
    report.Print()
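
Because the filters are chained, each one only sees the entries that survived the previous one, and the report counts entries per filter accordingly. The sketch below emulates that behaviour in plain Python on a handful of invented toy events. It does not use ROOT, and the function name cut_flow, the event values, and the lambda predicates are all assumptions made for illustration.

```python
def cut_flow(events, named_cuts):
    """Apply cuts in order; return (surviving events, per-cut report rows)."""
    report = []
    for label, predicate in named_cuts:
        n_all = len(events)  # entries reaching this cut
        events = [ev for ev in events if predicate(ev)]
        report.append({"label": label, "all": n_all, "pass": len(events)})
    return events, report


# Toy events, invented for this example only.
toy_events = [
    {"pt": 1500.0, "pz": 20000.0},
    {"pt": 1500.0, "pz": 18000.0},
    {"pt": 1000.0, "pz": 25000.0},
    {"pt": 2000.0, "pz": 30000.0},
]
cuts = [
    ("PT cut", lambda ev: ev["pt"] > 1400),
    ("PZ cut", lambda ev: ev["pz"] > 19000),
]

selected, report = cut_flow(toy_events, cuts)
for row in report:
    print(f'{row["label"]}: pass={row["pass"]} all={row["all"]}')
```

Here the second cut is evaluated on three entries rather than four, since one entry was already rejected by the first cut. The order of the criteria therefore changes the per-filter counts, though not the final selection.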