
# 15 Ways to use Proc Means in SAS

Looking to analyze your data with Proc Means but don’t know how to start?

No worries. In this article, we will show you 15 different ways to analyze your data using the MEANS procedure.

You will learn how to compute descriptive statistics and export the analysis results to an external file.

Let’s get started!

## Software

Before we continue, make sure you have access to SAS Studio. It’s free!

## Data Sets

The examples used in this article are based on the CARS data set from the SASHelp library.

## Variables

The CARS data set contains 15 variables related to the price, cost, make, model and specifications of a list of cars.

In this article, we will show you how you can use Proc Means to analyze the MSRP (i.e., Manufacturer’s Suggested Retail Price) for each car maker, model and type of car.

Of course, you will be able to use the same techniques to analyze your own data sets for your work projects.

1. Basic Structure

Let’s first run the MEANS procedure on the sashelp.cars data set:

Proc Means Data=SASHelp.cars;
Run;

The basic form of Proc Means computes a set of descriptive statistics for all the numeric variables in the data set. By default, the statistics N, Mean, Standard Deviation, Minimum and Maximum are computed.

2. Selecting Variables for Your Analysis

Sometimes you might be interested in only a few selected variables. You can add the VAR statement to limit the analysis to those variables.

Proc Means Data=SASHelp.cars;
Var MSRP Invoice;
Run;

The VAR statement above limits the analysis to only the MSRP and INVOICE variables. No results are computed for any other variables in the data set.

Note: You can analyze only numeric variables with the MEANS procedure. Listing a character variable in the VAR statement will give you an error.
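If you are not sure which variables are numeric, one way to check is PROC CONTENTS, which lists each variable together with its type. The sketch below is an illustration and not one of the 15 techniques in this article. It also uses the special variable list _NUMERIC_, which stands for every numeric variable in the data set.

```sas
/* List every variable in SASHelp.cars with its type (Num or Char). */
Proc Contents Data=SASHelp.cars;
Run;

/* _NUMERIC_ refers to all numeric variables, so no character
   variable can end up in the VAR statement by mistake. */
Proc Means Data=SASHelp.cars;
Var _numeric_;
Run;
```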

3. Requesting Specific Statistics

Getting the mean, standard deviation, minimum and maximum is nice. However, you might also want to compute additional statistics for your analysis.

Example 1: Lower Quartile, Median and Upper Quartile

Proc Means Data=SASHelp.cars Q1 Median Q3;
Var MSRP Invoice;
Run;

Adding the Q1, Median and Q3 keywords tells SAS to compute the Lower Quartile, Median and Upper Quartile.

Example 2: Mean, Standard Error and 95% Confidence Limits

Proc Means Data=SASHelp.cars Mean Stderr CLM;
Var MSRP Invoice;
Run;

Adding the MEAN, STDERR and CLM keywords computes the mean, standard error and 95% confidence limits.

Following is a list of statistics keywords that can be used in Proc Means:

• CLM
• CSS
• CV
• KURTOSIS|KURT
• LCLM
• MAX
• MEAN
• MIN
• MODE
• N
• NMISS
• RANGE
• SKEWNESS|SKEW
• STDDEV|STD
• STDERR
• SUM
• SUMWGT
• UCLM
• USS
• VAR
• MEDIAN|P50
• Q1|P25
• Q3|P75
• P1, P5, P10, P90, P95, P99
• QRANGE
• PROBT|PRT
• T
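As a small illustration of combining keywords from the list above, the sketch below requests the count, the number of missing values, the minimum, the maximum and the range in one step. The choice of keywords and the extra CYLINDERS variable are assumptions made only for this example.

```sas
/* Request five statistics at once; the keywords go on the
   PROC MEANS statement, after the DATA= option. */
Proc Means Data=SASHelp.cars N Nmiss Min Max Range;
Var MSRP Invoice Cylinders;
Run;
```

Note that once you list any keywords, only those statistics are displayed. The default set (N, Mean, Standard Deviation, Minimum and Maximum) is no longer added automatically.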

4. Display Different Decimal Places

You can also specify the number of decimal places to display for the statistics using the MAXDEC= option.

Proc Means Data=SASHelp.cars maxdec=0;
Var MSRP Invoice;
Run;

The MAXDEC=0 option tells SAS to not display any decimal places.

The analysis results are all integers. You can also display, say, 2 decimal places by adding the MAXDEC=2 option:

Proc Means Data=SASHelp.cars maxdec=2;
Var MSRP Invoice;
Run;

5. Grouping the Analysis with the CLASS Statement

The price of a Porsche is likely to be very different from that of a Toyota. Thus, it makes more sense to separate the analysis for each car maker. A CLASS statement can be added to the MEANS procedure to group your analysis:

Proc Means Data=SASHelp.cars;
Class Make;
Var MSRP Invoice;
Run;

By specifying the variable MAKE as the classification variable, there will be a separate analysis completed for each car maker. You can do the same for your own data as well. Use the CLASS statement to separate the analysis for different categories of your data.


6. Using Multiple Classification Variables

There is no limit to how many classification variables you can add to your analysis.

Adding two classification variables to the CLASS statement enables you to group your analysis into multiple levels:

Proc Means Data=SASHelp.cars;
Class Make Type;
Var MSRP Invoice;
Run;

By adding both the variables MAKE and TYPE to the CLASS statement, you can analyze the data for each combination of car maker and the types of cars they produce.

7. Changing the Displayed Order of the Classification Variable

There is also an option to change the displayed order of the classification variables.

Example: Order=Freq Option

Proc Means Data=SASHelp.cars ;
Class Make / order=freq;
Var MSRP Invoice;
Run;

The ORDER=FREQ option tells SAS to order the variable MAKE from the highest frequency to the lowest. You can also order the classification variable in descending alphabetical order.

Example: Descending Option

Proc Means Data=SASHelp.cars ;
Class Make / descending;
Var MSRP Invoice;
Run;

The DESCENDING option displays the MAKE variable in descending alphabetical order.

8. Analyze a Subset of the Observations

Let’s assume your two favourite car makers are BMW and Audi.

You are hoping to compute the statistics for these two brands only. The WHERE statement can be used to limit your analysis to the observations from these brands.

Proc Means Data=SASHelp.cars ;
Class Make;
Var MSRP Invoice;
Where Make in ("BMW" "Audi");
Run;

The WHERE statement above limits the analysis to the observations where MAKE is "BMW" or "Audi". Only these two brands of cars are analyzed.

9. Create an Output Data Set

You can also save the analysis results in an output data set using the OUTPUT statement.

Proc Means Data=SASHelp.cars ;
Var Invoice;
Output Out=OutStat;
Run;

The OUTPUT statement above creates an output data set called OUTSTAT. By default, the OUTSTAT data set contains the N, Mean, Standard Deviation, Minimum and Maximum statistics for the INVOICE variable.

When running the code above, the results are also printed on the Results window by default. If you don’t need the results printed on the Results window, you can suppress them by adding the NOPRINT option.

Proc Means Data=SASHelp.cars noprint;
Var Invoice;
Output Out=OutStat;
Run;

No analysis results will be printed on the Results window.
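Because NOPRINT suppresses the printed report, you need another step to look at the saved statistics. A minimal sketch, assuming the OUTSTAT data set is written to your WORK library as in the code above, is to print it with PROC PRINT:

```sas
/* Compute the default statistics for INVOICE without printing them. */
Proc Means Data=SASHelp.cars noprint;
Var Invoice;
Output Out=OutStat;
Run;

/* Display the contents of the output data set. */
Proc Print Data=OutStat;
Run;
```

With no statistics requested in the OUTPUT statement, the printed data set has one row per default statistic, identified by the _STAT_ variable, along with the automatic _TYPE_ and _FREQ_ variables.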

10. Requesting Additional Statistics in the Output Data Set

The OUTPUT statement also allows you to specify the statistics to be included in the output data set.

Example 1: Mean option

Proc Means Data=SASHelp.cars noprint;
Var Invoice;
Output Out = OutStat Mean = Mean1;
Run;

The Mean = Mean1 option tells SAS to include the mean in the output data set. The variable that holds it is named MEAN1.

Example 2: Q1, Median and Q3 options

You can also request additional statistics such as lower quartile and upper quartile:

Proc Means Data=SASHelp.cars noprint;
Var Invoice;
Output Out = OutStat Q1=LowerQ Median=Median Q3=UpperQ;
Run;

The Q1=, Median= and Q3= options add the lower quartile, median and upper quartile to the output data set.

The statistics keywords are the same as in #3 above:

CLM, NMISS, CSS, RANGE, CV, SKEWNESS|SKEW, KURTOSIS|KURT, STDDEV|STD, LCLM, STDERR, MAX, SUM, MEAN, SUMWGT, MIN, UCLM, MODE, USS, N, VAR, MEDIAN|P50, Q1|P25, Q3|P75, P1, P5, P10, P90, P95, P99, QRANGE, PROBT|PRT, T.

11. Autoname the Output Variables

In #10 above, we named the output variables MEAN1, LOWERQ, MEDIAN and UPPERQ.

You do not have to name the variables yourself.

With the AUTONAME option, SAS automatically names the variables for the statistics requested:

Proc Means Data=SASHelp.cars noprint;
Var Invoice;
Output Out = OutStat Mean= STD= / autoname;

Run;

The code above requests the mean and standard deviation to be computed.

With the AUTONAME option added, SAS automatically names the variables INVOICE_MEAN and INVOICE_STDDEV, respectively.

12. Analyze Multiple Variables Within a Single Output Statement (Advanced)

You can even compute statistics for multiple variables within a single OUTPUT statement.

Proc Means Data=SASHelp.cars noprint;
Var MSRP Invoice HorsePower;
Output Out=OutStat
Mean(MSRP)= Mean(Invoice)= Mean(Horsepower)= / autoname;
Run;

The Mean(MSRP)= option computes the mean of MSRP. The same applies to the Mean(Invoice)= option and the Mean(Horsepower)= option.

13. Understand the _TYPE_ Variable in the Output Data Set

The _TYPE_ variable is automatically created in the OUTPUT data set from the MEANS procedure.

It is used to identify the combination of classification values.

Let’s look at an example.

Example 1: No Classification (i.e., no CLASS statement)

Proc Means Data=SASHelp.cars noprint;
Var MSRP;
Output Out=OutStat Mean= STD= /autoname;
Run;

The MEANS procedure above does not have a CLASS statement. The _TYPE_ variable is 0 for the one observation in the output data set.

Example 2: One Classification Variable

Proc Means Data=SASHelp.cars noprint;
Var MSRP;
Class Origin;
Output Out=OutStat Mean= STD= /autoname;

Run;

The MEANS procedure above has 1 classification variable (i.e., ORIGIN). There are two groups of statistics generated in the output data set:

• _TYPE_ = 0
• _TYPE_ = 1

The observation with _TYPE_ = 0 identifies the “overall” analysis. The statistics are computed for all observations, with no classification applied.

The observations with _TYPE_ = 1 identify the analysis for each classification level. The statistics, in our example, are computed for each car origin.

Example 3: Two Classification Variables

Proc Means Data=SASHelp.cars noprint;
Var MSRP;
Class Origin DriveTrain;
Output Out=OutStat Mean= STD= /autoname;

Run;

The MEANS procedure above has a CLASS statement with two classification variables (i.e., ORIGIN and DRIVETRAIN). In total, there are 4 groups of statistics generated in the output data set:

• _TYPE_ = 0
• _TYPE_ = 1
• _TYPE_ = 2
• _TYPE_ = 3

Again, when _TYPE_ = 0, the statistics computed are for the overall analysis. No classification level is used.

When _TYPE_ = 1, the statistics are computed for each of the DRIVETRAIN levels (i.e., All, Front and Rear). The classification from the ORIGIN variable is not considered.

When _TYPE_ = 2, the statistics are computed for each of the ORIGIN levels (i.e., Asia, Europe, USA). The classification from the DRIVETRAIN variable is not considered.

Finally, when _TYPE_ = 3, the statistics are computed for each combination of the ORIGIN and DRIVETRAIN values. Both of the classification variables are used.

14. Simplify the Output Data Set with the NWAY Option

More often than not, you want to get statistics for each combination of the classification values only (e.g., ORIGIN x DRIVETRAIN).

You might not care about the overall analysis or any individual classification variable alone (e.g., ORIGIN alone or DRIVETRAIN alone).

You can use the NWAY option to remove these statistics from the output data set.

Proc Means Data=SASHelp.cars noprint Nway;
Var MSRP;
Class Origin DriveTrain;
Output Out=OutStat Mean= STD= /autoname;

Run;

The NWAY option tells SAS to keep only the observations where the variable _TYPE_ has the highest value.

In our example, only the observations with _TYPE_=3 will be kept.

15. Printing the Results to an External PDF File

You can easily print the statistical results to an external file such as PDF or RTF using ODS (Output Delivery System).

ODS PDF File='/folders/myfolders/Means_Result.PDF';
Proc Means Data=SASHelp.cars;
Var MSRP;
Class Origin DriveTrain;
Run;
ODS PDF Close;

The ODS statement above prints the results from the MEANS procedure to an external PDF file. You can also export the results to an RTF file.

ODS RTF File='/folders/myfolders/Means_Result.RTF';
Proc Means Data=SASHelp.cars;
Var MSRP;
Class Origin DriveTrain;
Run;
ODS RTF Close;

Simply replace the PDF keyword with RTF and you will be able to print the results to an RTF file.
