Looking to analyze your data with Proc Means but don’t know how to start?

No worries. In this article, we will show you 15 different ways to analyze your data using the MEANS procedure.

You will learn how to compute descriptive statistics and export the analysis results to an external file.

Let’s get started!

**Data Sets**

The examples used in this article are based on the CARS data set, which you can find in the SASHelp library.
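If you would like a quick look at the data before running any analysis, a minimal sketch such as the one below may help. It assumes only that the SASHelp library is available in your SAS session; the choice of 5 observations is arbitrary.

```sas
/* List the variables in the CARS data set and their types */
Proc Contents Data=SASHelp.cars;
Run;

/* Print the first 5 observations (the number 5 is an arbitrary choice) */
Proc Print Data=SASHelp.cars (obs=5);
Run;
```

The Proc Contents output shows which variables are numeric, which matters because Proc Means analyzes numeric variables only.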

**Variables**

The CARS data set contains 15 variables related to the price, cost, make, model and specifications of a list of cars.

In this article, we will show you how you can use Proc Means to analyze the MSRP (i.e., Manufacturer’s Suggested Retail Price) for each car maker, model and type of car.

Of course, you will be able to use the same techniques to analyze your own data sets for your work projects.

**1. Basic Structure**

Let’s first run the MEANS procedure on the sashelp.cars data set:

Proc Means Data=**SASHelp.cars**;

Run;

The basic form of Proc Means computes a set of descriptive statistics (N, mean, standard deviation, minimum and maximum) for all of the __numeric__ variables in the data set.

**2. Selecting Variables for Your Analysis**

Sometimes you might be interested in only a few selected variables.

Proc Means Data=SASHelp.cars;

**Var MSRP Invoice;**

Run;

The VAR statement above limits the analysis to only the MSRP and INVOICE variables. No results are computed for any other variables in the data set.

Note: You can analyze only numeric variables with the MEANS procedure. Listing a character variable in the VAR statement will give you an error.

**3. Requesting Specific Statistics**

Getting the mean, standard deviation, minimum and maximum is nice. However, you might also want to compute additional statistics for your analysis.

You can request additional statistics by adding the corresponding statistics keywords.

Example 1: Lower Quartile, Median and Upper Quartile

Proc Means Data=SASHelp.cars **Q1 Median Q3**;

Var MSRP Invoice;

Run;

Example 2: Mean, Standard Error and 95% Confidence Limits

Proc Means Data=SASHelp.cars **Mean Stderr CLM**;

Var MSRP Invoice;

Run;

The following statistics keywords can be used in Proc Means:

- CLM
- CSS
- CV
- KURTOSIS|KURT
- LCLM
- MAX
- MEAN
- MIN
- MODE
- N
- NMISS
- RANGE
- SKEWNESS|SKEW
- STDDEV|STD
- STDERR
- SUM
- SUMWGT
- UCLM
- USS
- VAR
- MEDIAN|P50
- P1, P5, P10, Q1|P25, Q3|P75, P90, P95, P99
- QRANGE
- PROBT|PRT
- T

**4. Display Different Decimal Places**

You can also specify the number of decimal places to display for the statistics using the MAXDEC= option.

Proc Means Data=SASHelp.cars **maxdec=0**;

Var MSRP Invoice;

Run;

The MAXDEC=0 option tells SAS not to display any decimal places.

The analysis results are all displayed as integers.

You can also display, say, 2 decimal places by adding the MAXDEC=2 option:

Proc Means Data=SASHelp.cars **maxdec=2**;

Var MSRP Invoice;

Run;

**5. Group Your Analysis**

The price of a Porsche is likely to be very different from that of a Toyota. Thus, it makes more sense to separate the analysis for each car maker.

Proc Means Data=SASHelp.cars;

**Class Make;**

Var MSRP Invoice;

Run;

You can do the same for your own data as well. Use the CLASS statement to separate the analysis for different categories of your data.
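The techniques covered so far can also be combined in a single step. The sketch below is one possible combination, not the only way to do it: it requests specific statistics, rounds the display with MAXDEC=, and groups the results by car maker. The particular statistics chosen (N, Mean and Median) are just for illustration.

```sas
/* Combine statistics keywords, MAXDEC= and a CLASS statement in one step */
Proc Means Data=SASHelp.cars N Mean Median maxdec=0;
    Class Make;          /* one set of statistics per car maker */
    Var MSRP Invoice;    /* analyze these two numeric variables only */
Run;
```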


**6. Adding Multiple Classification Variables**

You are not limited to a single classification variable in your analysis.

Adding two classification variables to the CLASS statement enables you to group your analysis into multiple levels:

Proc Means Data=SASHelp.cars;

Class **Make Type;**

Var MSRP Invoice;

Run;

By adding both the variables MAKE and TYPE to the CLASS statement, you can analyze the data for each combination of car maker and the types of cars they produce.

**7. Changing the Displayed Order of the Classification Variable**

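There are also options to change the displayed order of the classification variables. The ORDER=FREQ option sorts the classification levels by descending frequency count, so the levels with the most observations appear first.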

Example: Order=Freq Option

Proc Means Data=SASHelp.cars;

Class Make / **order=freq**;

Var MSRP Invoice;

Run;

Example: Descending Option

Proc Means Data=SASHelp.cars;

Class Make / **descending**;

Var MSRP Invoice;

Run;

The DESCENDING option displays the MAKE variable in descending alphabetical order.

**8. Analyze a Subset of the Observations**

Let’s assume your two favourite car makers are BMW and Audi.

You are hoping to compute the statistics for these two brands only.

Proc Means Data=SASHelp.cars;

Class Make;

Var MSRP Invoice;

**Where Make in ("BMW" "Audi");**

Run;

**9. Create an Output Data Set**

You can also save the analysis results in an output data set using the OUTPUT statement.

Proc Means Data=SASHelp.cars;

Var Invoice;

**Output Out=OutStat;**

Run;

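The OUTPUT statement above creates an output data set called OUTSTAT.

If you only need the output data set and not the printed results, you can add the NOPRINT option:

Proc Means Data=SASHelp.cars **noprint**;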

Var Invoice;

Output Out=OutStat;

Run;
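To check what was saved, you can print the output data set. The sketch below assumes the OUTSTAT data set was created by the step above. When no statistics are requested in the OUTPUT statement, the data set holds one observation per default statistic (N, MIN, MAX, MEAN and STD), identified by the _STAT_ variable, along with the automatic _TYPE_ and _FREQ_ variables.

```sas
/* Assumes OutStat was created by the OUTPUT statement in the step above */
Proc Print Data=OutStat;
Run;
```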

**10. Requesting Additional Statistics in the Output Data Set**

The OUTPUT statement also allows you to specify the statistics to be included in the output data set.

Example 1: Mean option

Proc Means Data=SASHelp.cars;

Var Invoice;

Output Out = OutStat

**Mean = Mean1**;

Run;

The **Mean = Mean1** option tells SAS to include the mean in the output data set.

The variable that holds the mean is named MEAN1.

Example 2: Q1, Median and Q3 options

You can also request additional statistics such as lower quartile and upper quartile:

Proc Means Data=SASHelp.cars;

Var Invoice;

Output Out = OutStat

**Q1=LowerQ Median=Median Q3=UpperQ;**

Run;

The Q1=, Median= and Q3= options store the lower quartile, median and upper quartile in the output data set.

The list of statistics keywords is the same as in #3 above:

CLM, NMISS, CSS, RANGE, CV, SKEWNESS|SKEW, KURTOSIS|KURT, STDDEV|STD, LCLM, STDERR, MAX, SUM, MEAN, SUMWGT, MIN, UCLM, MODE, USS, N, VAR, MEDIAN|P50, P1, P5, P10, Q1|P25, Q3|P75, P90, P95, P99, QRANGE, PROBT|PRT, T.

**11. Autoname the Output Variables**

In #10 above, we name the output variables MEAN1, LOWERQ, MEDIAN and UPPERQ.

You do not have to name the variables yourself in Proc Means.

With the AUTONAME option, SAS automatically names the variables for the statistics requested:

Proc Means Data=SASHelp.cars;

Var Invoice;

Output Out = OutStat Mean= STD= **/ autoname;**

Run;

With the AUTONAME option added, SAS automatically names the mean and standard deviation variables INVOICE_MEAN and INVOICE_STDDEV, respectively.


**12. Analyze Multiple Variables Within a Single Output Statement (Advanced)**

You can even compute statistics for multiple variables within a single OUTPUT statement.

Proc Means Data=SASHelp.cars;

Var MSRP Invoice HorsePower;

Output Out=OutStat

**Mean(MSRP)= Mean(Invoice)= Mean(Horsepower)= / autoname;**

Run;

The Mean(MSRP)= option computes the mean of the MSRP. The same applies to the Mean(Invoice)= option and the Mean(Horsepower)= option.

**13. Understand the _TYPE_ Variable in the Output Data Set**

The _TYPE_ variable is automatically created in the OUTPUT data set from the MEANS procedure.

It is used to identify the combination of classification values.

Let’s look at an example.

Example 1: No Classification (i.e., no CLASS statement)

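Proc Means Data=SASHelp.cars;

Var MSRP;

Output Out=OutStat Mean= STD= /autoname;

Run;

With no CLASS statement, the output data set contains a single observation with _TYPE_ = 0.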

Example 2: One Classification Variable

Proc Means Data=SASHelp.cars;

Var MSRP;

**Class Origin;**

Output Out=OutStat Mean= STD= /autoname;

Run;

The MEANS procedure above has 1 classification variable (i.e., ORIGIN). There are two groups of statistics generated in the output data set:

- _TYPE_ = 0
- _TYPE_ = 1

The observation with _TYPE_ = 0 identifies the “overall” analysis. The statistics are computed for all values with no classification level.

The observations with _TYPE_ = 1 identify the analysis for each classification level. The statistics, in our example, are computed for each car origin.

Example 3: Two Classification Variables

Proc Means Data=SASHelp.cars;

Var MSRP;

**Class Origin DriveTrain;**

Output Out=OutStat Mean= STD= /autoname;

Run;

The MEANS procedure above has a CLASS statement with two classification variables (i.e., ORIGIN and DRIVETRAIN). In total, there are 4 groups of statistics generated in the output data set:

- _TYPE_ = 0
- _TYPE_ = 1
- _TYPE_ = 2
- _TYPE_ = 3

Again, when _TYPE_ = 0, the statistics computed are for the overall analysis. No classification level is used.

When _TYPE_ = 1, the statistics are computed for each of the DRIVETRAIN levels (i.e., All, Front and Rear). The classification from the ORIGIN variable is not considered.

When _TYPE_ = 2, the statistics are computed for each of the ORIGIN levels (i.e., Asia, Europe, USA). The classification from the DRIVETRAIN variable is not considered.

Finally, when _TYPE_ = 3, the statistics are computed for each combination of the ORIGIN and DRIVETRAIN values. Both of the classification variables are used.
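If you want to look at one of these groups at a time, one option is to filter the output data set on the _TYPE_ variable. The sketch below assumes OUTSTAT was created by Example 3 above, with ORIGIN and DRIVETRAIN as the classification variables.

```sas
/* Assumes OutStat was created by Example 3 (Class Origin DriveTrain) */
Proc Print Data=OutStat;
    Where _TYPE_ = 3;    /* keep only the Origin x DriveTrain combinations */
Run;
```

Changing the condition to _TYPE_ = 2 or _TYPE_ = 1 would show the statistics by ORIGIN alone or by DRIVETRAIN alone.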

**14. Simplify the Output Data Set with NWAY Option**

More often than not, you want to get statistics for each combination of the classification values only (e.g., ORIGIN x DRIVETRAIN).

You might not care about the overall analysis or any individual classification variable alone (e.g., ORIGIN alone or DRIVETRAIN alone).

You can use the NWAY option to remove these statistics from the output data set.

Proc Means Data=SASHelp.cars **Nway**;

Var MSRP;

Class Origin DriveTrain;

Output Out=OutStat Mean= STD= /autoname;

Run;

The NWAY option tells SAS to keep only the observations where the variable _TYPE_ has the highest value.

In our example, only the observations with _TYPE_ = 3 are kept.

**15. Printing the Results to an External PDF File**

You can easily print the statistical results to an external file such as PDF or RTF using ODS (Output Delivery System).

**ODS PDF File='/folders/myfolders/Means_Result.PDF';**

Proc Means Data=SASHelp.cars;

Var MSRP;

Class Origin DriveTrain;

Run;

**ODS PDF Close;**

The ODS statement above prints the results from the MEANS procedure to an external PDF file.

You can also export the results to an RTF file.

ODS **RTF** File='/folders/myfolders/Means_Result.**RTF**';

Proc Means Data=SASHelp.cars;

Var MSRP;

Class Origin DriveTrain;

Run;

ODS **RTF** Close;

Simply replace the PDF keyword with RTF and you will be able to print the results to an RTF file.

That’s it! If you have any questions, feel free to leave a comment below.
