Summary Statistics

The Summary Statistics node runs R’s summary() function on your entire dataset, giving you a quick overview of all variables.

What it does

Applies summary() to the connected dataframe
For numeric columns: shows min, 1st quartile (\(Q_1\)), median, mean (\(\bar{x}\)), 3rd quartile (\(Q_3\)), and max
For categorical (factor) columns: shows frequency counts of each level; plain character columns report only length, class, and mode
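As a minimal sketch of what this looks like in plain R, the example below builds a small data frame and summarizes it. The data frame my_data and its columns height and group are hypothetical and exist only for illustration; the code the node actually generates may differ.

```r
# Hypothetical example data; the column names are illustrative only.
my_data <- data.frame(
  height = c(150, 160, 170, 180, 190),
  group  = factor(c("a", "b", "a", "b", "a"))
)

# Whole data frame: one block of statistics per column
print(summary(my_data))

# Numeric column: Min., 1st Qu., Median, Mean, 3rd Qu., Max.
num_summary <- summary(my_data$height)
print(num_summary)

# Factor column: frequency count per level
grp_summary <- summary(my_data$group)
print(grp_summary)
```

Individual statistics can be pulled out by name, for example num_summary[["Median"]].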

This is the simplest node to use: just connect data and run it.

Setting	Required	Description
Upstream connection	Yes	A node providing data
Comment	No	Annotation for generated R code

There are no user-configurable parameters beyond the upstream connection and the optional comment.

The output is a summary table showing descriptive statistics for each column. For example:

summary(my_data)

This is a great first step after loading data: it quickly reveals the range, central tendency, and distribution of your variables
Look for an unexpected NA's count in the summary, which reports how many values are missing in that column
For more targeted statistics, use the Mean / SD node with a formula
The five-number summary (min, \(Q_1\), median, \(Q_3\), max) is the basis of a box plot
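The last few points can be sketched in plain R as follows. The vector x is hypothetical and chosen only for illustration. The sketch shows the NA's entry that summary() adds when values are missing, and compares the five-number summary from quantile() with the statistics that boxplot.stats() reports for a box plot.

```r
# Hypothetical vector with one missing value; values are illustrative only.
x <- c(2, 4, 6, 8, 10, NA)

# summary() adds an "NA's" entry when values are missing
s <- summary(x)
print(s)

# Direct count of missing values, for comparison
na_count <- sum(is.na(x))

# Five-number summary of the non-missing values
five <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE)
print(five)

# boxplot.stats() returns the statistics a box plot draws, without plotting
bp <- boxplot.stats(x)$stats
print(bp)
```

Note that boxplot.stats() uses Tukey's hinges rather than the default quantile() quartiles, so for some sample sizes the box edges can differ slightly from the 1st and 3rd quartiles that summary() prints.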