DEFINE_COLUMNS


Define columns in a TTree using RDataFrame

URL: https://root.cern/doc/master/classROOT_1_1RDataFrame.html

Example

This wrapper can be used in the following way:

rule define_columns:
    input:
        "ntuple0.root",
    output:
        "ntuple0_output.root",
    log:
        "logs/define_columns/define_columns.log",
    params:
        input_tree_name="TestTree",
        output_tree_name="TestTree",
        branches=[
            ["p2", "px * px + py * py + pz * pz"],
            ["pt", "sqrt(px * px + py * py)"],
        ],
        redefine=["pt"],
    threads: 2
    wrapper:
        "v9.1.1/phys/root/define_columns"

Note that input, output and log file paths can be chosen freely.
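The expression strings in params.branches are handed to RDataFrame, which evaluates them once per entry of the tree. As a rough illustration of what the two expressions in the rule above compute, the following plain-Python sketch evaluates them for a single event. The numeric values are made up for illustration and are not taken from the wrapper's test data.

```python
import math

# Hypothetical momentum components for one event (illustrative values only).
px, py, pz = 3.0, 4.0, 12.0

# Plain-Python equivalents of the two expression strings in params.branches.
p2 = px * px + py * py + pz * pz   # squared momentum magnitude
pt = math.sqrt(px * px + py * py)  # transverse momentum

print(p2, pt)
```

Because "pt" is listed under redefine in the rule above, the example presumably assumes that the input tree already contains a branch named pt.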

When running with

snakemake --use-conda

the software dependencies will be automatically deployed into an isolated environment before execution.

Software dependencies

  • root=6.38.00

Input/Output

Input:

  • TTree ROOT file

Output:

  • TTree ROOT file

Params

  • input_tree_name: name of the input TTree

  • output_tree_name: name of the output TTree

  • branches: branches to be defined, given as a list of [name, expression] pairs, e.g. [["branch_name1", "definition_expression1"], ["branch_name2", "definition_expression2"]]. If not specified, no branch is defined and the TTree is simply saved. (optional)

  • redefine: list of branch names to be redefined instead of newly defined. Each name must also appear in params.branches. (optional)

  • branches_to_save: list of branch names to be saved. If not specified, all branches are saved, including those defined by this wrapper. (optional)
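The constraints on these params can be checked before the rule runs. The sketch below is a hypothetical helper, not part of the wrapper: check_params and its messages are assumptions introduced here for illustration. It verifies that every entry of branches is a [name, expression] pair and that every name in redefine also appears in branches.

```python
def check_params(branches, redefine=()):
    """Return a list of problems found in the given params (hypothetical helper)."""
    problems = []
    names = []
    for entry in branches:
        # Each entry is expected to be a [name, expression] pair.
        if len(entry) != 2:
            problems.append(f"expected [name, expression], got {entry!r}")
            continue
        names.append(entry[0])
    for name in redefine:
        # A redefined branch must also have a definition in branches.
        if name not in names:
            problems.append(f"'{name}' is in redefine but not in branches")
    return problems


branches = [
    ["p2", "px * px + py * py + pz * pz"],
    ["pt", "sqrt(px * px + py * py)"],
]
print(check_params(branches, redefine=["pt"]))
print(check_params(branches, redefine=["eta"]))
```

The first call reports no problems; the second reports that "eta" has no matching entry in branches.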

Authors

  • Anfeng Li

Code

__author__ = "Anfeng Li"
__copyright__ = "Copyright 2024, Anfeng Li"
__email__ = "anfeng.li@cern.ch"
__license__ = "MIT"


import ROOT

# Let ROOT run the event loop with the number of threads given to the rule.
ROOT.EnableImplicitMT(snakemake.threads)

# Optional params fall back to "nothing to define" and "save everything".
redefine_list = snakemake.params.get("redefine", [])
branches = snakemake.params.get("branches", [])
branches_to_save = snakemake.params.get("branches_to_save", None)

df = ROOT.RDataFrame(snakemake.params.input_tree_name, snakemake.input[0])
for branch_name, branch_definition in branches:
    if branch_name in redefine_list:
        # Overwrite a column that already exists.
        df = df.Redefine(branch_name, branch_definition)
    else:
        # Add a new column.
        df = df.Define(branch_name, branch_definition)
if branches_to_save is not None:
    # Write only the requested branches.
    df.Snapshot(
        snakemake.params.output_tree_name,
        snakemake.output[0],
        branches_to_save,
    )
else:
    # No selection given: write all branches.
    df.Snapshot(snakemake.params.output_tree_name, snakemake.output[0])
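To see which branches go through Define and which through Redefine without having ROOT installed, the loop above can be exercised against a stand-in object. This is only a sketch: RecordingFrame and apply_branches are hypothetical names introduced here. They record the method calls and do not touch any ROOT file.

```python
class RecordingFrame:
    """Stand-in for an RDataFrame node; it only records which calls were made."""

    def __init__(self, calls=None):
        self.calls = calls if calls is not None else []

    def Define(self, name, expression):
        # Like RDataFrame, return a new node rather than mutating this one.
        return RecordingFrame(self.calls + [("Define", name)])

    def Redefine(self, name, expression):
        return RecordingFrame(self.calls + [("Redefine", name)])


def apply_branches(df, branches, redefine_list):
    """Same dispatch as the wrapper's loop: Redefine if listed, else Define."""
    for branch_name, branch_definition in branches:
        if branch_name in redefine_list:
            df = df.Redefine(branch_name, branch_definition)
        else:
            df = df.Define(branch_name, branch_definition)
    return df


result = apply_branches(
    RecordingFrame(),
    [
        ["p2", "px * px + py * py + pz * pz"],
        ["pt", "sqrt(px * px + py * py)"],
    ],
    ["pt"],
)
print(result.calls)
```

With the params from the example rule, p2 is routed to Define and pt to Redefine, in the order they are listed in branches.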