Mean / SD

The Mean / SD node calculates descriptive statistics — mean, standard deviation, variance, count, and standard error — grouped by one or more explanatory variables.

What it does

Computes grouped summary statistics using R’s aggregate() function
Calculates: mean (\(\bar{x}\)), standard deviation (\(s\)), variance (\(s^2\)), count (\(n\)), and standard error (\(SE = \frac{s}{\sqrt{n}}\))
Optionally stores the result as a new dataframe for downstream use
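
As a minimal sketch of what these statistics mean for a single group, the snippet below computes them by hand and can be compared against R's built-in `var()` and `sd()`. The vector `x` is hypothetical illustration data, not something the node produces. Note that \(s\) and \(s^2\) here are the sample versions, with \(n - 1\) in the denominator, which is what R's `sd()` and `var()` return.

```r
# Hypothetical measurements for a single group (illustration only).
x <- c(4, 6, 8, 10)

n     <- length(x)                       # count
x_bar <- mean(x)                         # mean
s2    <- sum((x - x_bar)^2) / (n - 1)    # sample variance, same as var(x)
s     <- sqrt(s2)                        # sample standard deviation, same as sd(x)
se    <- s / sqrt(n)                     # standard error of the mean

stats <- c(mean = x_bar, sd = s, var = s2, n = n, se = se)
print(stats)
```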

How to use it

Connect a data source — drag an edge from an Input CSV node
Enter a formula — use the format response ~ explanatory (e.g. weight ~ diet)
Optionally name the output — provide a dataframe name to store results
Click Run

Formula syntax

Formula	Meaning
`weight ~ diet`	Mean/SD of `weight` grouped by `diet`
`weight ~ diet + time`	Grouped by both `diet` and `time`
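
To illustrate how the two formulas in the table differ, here is a small sketch run directly in R. The data frame `chick_data` and its values are made up for this example; only the column names follow the formulas above. `FUN = mean` is used to keep the output short.

```r
# Hypothetical toy data; column names match the formulas in the table.
chick_data <- data.frame(
  weight = c(10, 12, 14, 20, 22, 24, 11, 13, 21, 23),
  diet   = c("A", "A", "A", "B", "B", "B", "A", "A", "B", "B"),
  time   = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2)
)

# One explanatory variable: one row per level of diet.
by_diet <- aggregate(weight ~ diet, data = chick_data, FUN = mean)

# Two explanatory variables: one row per observed diet/time combination.
by_diet_time <- aggregate(weight ~ diet + time, data = chick_data, FUN = mean)

print(by_diet)
print(by_diet_time)
```

With this toy data the first call returns two rows and the second returns four, one for each combination of `diet` and `time` that occurs in the data.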

Configuration

Setting	Required	Description
Upstream connection	Yes	A node that provides the data, such as an Input CSV node
Formula	Yes	`response ~ explanatory` format
Output dataframe name	No	Store results under this name for downstream use
Comment	No	Annotation for generated R code

Output

A table showing the computed statistics for each group:

Group	mean	sd	var	n	se
A	5.03	1.21	1.46	10	0.38
B	7.89	0.95	0.90	10	0.30

Generated R code

aggregate(
  formula = weight ~ diet,
  data = chick_data,
  FUN = function(x) c(
    mean = mean(x),
    sd = sd(x),
    var = var(x),
    n = length(x),
    se = sd(x) / sqrt(length(x))
  )
)

If you provide an output dataframe name (here `diet_summary`), the result is assigned to it:

diet_summary <- aggregate(...)
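
One thing to be aware of when working with this result in plain R: when `FUN` returns a named vector, `aggregate()` stores the statistics in a single matrix column named after the response, not in separate columns. The sketch below shows one common way to expand it with `do.call(data.frame, ...)`. The toy data and the name `flat_summary` are assumptions for illustration; whether the node performs a similar step before displaying or exporting the table is not stated here.

```r
# Hypothetical toy data (illustration only).
chick_data <- data.frame(
  weight = c(10, 12, 14, 20, 22, 24),
  diet   = c("A", "A", "A", "B", "B", "B")
)

# Same summary function as in the generated code; the formula is passed
# positionally.
diet_summary <- aggregate(
  weight ~ diet,
  data = chick_data,
  FUN = function(x) c(
    mean = mean(x),
    sd = sd(x),
    var = var(x),
    n = length(x),
    se = sd(x) / sqrt(length(x))
  )
)

# The statistics sit in one matrix column called "weight".
print(is.matrix(diet_summary$weight))

# Expand the matrix into ordinary columns: weight.mean, weight.sd, ...
flat_summary <- do.call(data.frame, diet_summary)
print(names(flat_summary))
```

After flattening, each statistic can be accessed as a regular column, for example `flat_summary$weight.se`.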

Tips

The “Uses: weight, diet” line shown below the formula input lists the variables found in the formula; check it to verify that the column names are correct
Providing an output dataframe name lets you chain this node’s results into downstream nodes like Output CSV
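The standard error \(SE = \frac{s}{\sqrt{n}}\) is useful for constructing confidence intervals: \(\bar{x} \pm t_{\alpha/2,\,n-1} \times SE\), where \(t_{\alpha/2,\,n-1}\) is the critical value of the t distribution with \(n - 1\) degrees of freedom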
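
As a rough sketch of the confidence interval tip, the snippet below builds a two-sided 95% interval from a mean, standard deviation, and count. The input values are taken from group A in the example output table; the variable names and the choice of a 95% level are assumptions made for illustration.

```r
# Summary values for one group (group A in the example output table).
x_bar <- 5.03
s     <- 1.21
n     <- 10

alpha  <- 0.05                          # assumed level: 95% interval
se     <- s / sqrt(n)                   # standard error of the mean
t_crit <- qt(1 - alpha / 2, df = n - 1) # t critical value, n - 1 degrees of freedom

ci <- c(lower = x_bar - t_crit * se, upper = x_bar + t_crit * se)
print(round(ci, 2))
```

The interval relies on the usual t-interval assumptions: roughly normal data within the group, or a sample large enough for the mean to be approximately normal.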