In this tutorial we shall learn 3 methods of how to rename variables in SAS using RENAME keyword or RENAME statement and the difference in the outputs.
Firstly let us define our library location:
libname mylib '/home/u50132927/My_datasets';
For this tutorial we shall be leveraging SAS' inbuilt dataset: SASHELP.SHOES.
DATA statement created a new dataset named shoes in our library mylib.
SET statement copies the rows specified in SET statement and pastes it to our new DATA.
DATA mylib.shoes;
SET SASHELP.shoes;
RUN;
We can keep variables using 3 different methods:
Using RENAME keyword in DATA step.
Using RENAME keyword in SET step.
Using RENAME Statement after SET step.
All 3 of them yield same output (provided we are not creating any new variable using those variables which are not kept)
Basic Syntax of rename:
RENAME Old_Name = New_Name
In the following 3 code chunks we are renaming variable Sales as Revenue from our data.
DATA MYLIB.SHOES (RENAME = Sales = Revenue);
SET SASHELP.SHOES;
RUN;
DATA MYLIB.SHOES ;
SET SASHELP.SHOES(RENAME = Sales = Revenue);
RUN;
DATA MYLIB.SHOES ;
SET SASHELP.SHOES;
RENAME Sales = Revenue;
RUN;
Renaming multiple variables
In the next chunks of code we are renaming multiple variables: Sales to revenue and Product to product_category
DATA MYLIB.SHOES (RENAME = (SALES = Revenue PRODUCT = Product_category));
SET SASHELP.SHOES ;
RUN;
DATA MYLIB.SHOES ;
SET SASHELP.SHOES(RENAME = (SALES = Revenue PRODUCT = Product_category)) ;
RUN;
DATA MYLIB.SHOES ;
SET SASHELP.SHOES;
RENAME SALES = Revenue PRODUCT = Product_category;
RUN;
Difference in using RENAME statement while creating a new variable
For example, let us Rename the variables Sales as Revenue in SET statement and create a new variable called MYVARIABLE = 2* SALES.
DATA MYLIB.SHOES ;
SET SASHELP.SHOES(RENAME = Sales = Revenue);
MYVARIABLE = SALES*2;
RUN;
Our new variable is containing missing values, moreover we have an extra column named SALES. While original sales column has been renamed to Revenue in the following output. Why?
Ans. SAS firstly copies the data rows and columns available in SET statement. Since we have renamed the variable SALES in our SET statement thus SALES variable does not exist for SAS. Hence MYVARIABLE cannot be computed and has NA values. Similarly it creates a new variable called SALES.
To sort this issue since in our SET statement we have renamed our variable to Revenue thus we can create MYVARIABLE = Revenue*2
DATA MYLIB.SHOES ;
SET SASHELP.SHOES(RENAME = Sales = Revenue);
MYVARIABLE = Revenue*2;
RUN;
Now, let us rename the variables Sales in DATA statement and create a new variable called MYVARIABLE = 2* SALES.
DATA MYLIB.SHOES (RENAME = Sales = Revenue);
SET SASHELP.SHOES;
MYVARIABLE = SALES*2;
RUN;
Since in SET statement SALES variable is already present thus our MYVARIABLE gets created successfully and in DATA step our SALES variable gets renamed to Revenue.
Similarly writing a RENAME Statement after SET Statement produces the new variable and renames SALES to Revenue.
DATA MYLIB.SHOES ;
SET SASHELP.SHOES;
MYVARIABLE = SALES*2;
RENAME Sales = Revenue;
RUN;
Comentarii