案例bussiness之finance案例sas BAFI 435 – Empirical Fina
当前位置:以往案例 > >案例bussiness之finance案例sas BAFI 435 – Empirical Fina
2017-07-05

Case Western Reserve University Weatherhead School of Management

BAFI 435 – Empirical Finance Fall 2018


project 3 (All Sessions) Due Wed, September 19th, 2018


This project will explore data extraction from WRDS. The objective is to prepare the datasets that we will use in Module 4. Jumping ahead to Module 4, we will replicate a study reported in CLM about market reactions to earnings announcements. The main figure of the study is shown below:

image.png

Figure 1. Market reactions to earnings announcements from Dow Jones’ firms, 1989-1993.


More specifically, we will look at market reactions to earnings announcements. Our sample will be quarterly earnings announcements for the 30 firms in the Dow Jones Index Average (DJIA), or simply Dow Jones index, between 1989 and 1993.

The project is an example of an empirical examination that depends on data collection from many different sources. Here, the following “files” will be involved:

· The list of firms that have been part of the Dow Jones Index

· Data on earnings announcements for those firms

· Data on market expectations about earnings to be released

· Data on returns for the firms

· Data on returns for the market.



Let’s proceed step by step.

1. I collected for you the components of the Dow Jones index between 1989 and 1993. The following table shows the data. For each component, I show the PERMNO, CUSIP, and the period the company was part of the index (that is between INIT_DATE and END_DATE). This table is saved as a simple text file DJIA_COMPONENTS_1989_1993.txt, available on Canvas.

image.png

Table 1. Dow Jones' components, 1989-1993



Read this data into a SAS code. You can either read the file directly or copy and paste the contents of the file into your code and use the DATALINES statement. Force the CUSIP variable to be of the character datatype. Also, make sure you read the dates correctly by using the proper informat. After reading the dataset into your SAS code, print it to make sure the import worked.

Hint: notice that there are more than 30 entries in the table above. This happens because companies go in and out of the DJIA index. See, for example, that PERMNO=15069 exited the index in May 5th, 1991.

2. Now let’s work on the earnings data. Earnings data is collected by the IBES database in WRDS, whose physical location is ‘/wrds/ibes/sasdata’. The first info we will collect refer to the quarterly earnings announcements. The IBES dataset ACTU_EPSUS contains such information. Create a library in your SAS code with a logical reference to the IBES directory and proceed with a PROC CONTENTS on the ACTU_EPSUS dataset. Take a look at the output. In particular, notice how many records the dataset has.

You will see a bunch of variables. What you won’t see is a PERMNO variable. Instead, the identifiers for a company in IBES are CUSIP and OFTIC (the official ticker for the firm). Fortunately, you already have the CUSIP for the Dow Jones components, so you can search for a company’s earnings using its CUSIP.

a) Start the data collection by obtaining data from ACTU_EPSUS for the sample of firms present in the Dow Jones index between 1989 and 1993. That is, combine the ACTU_EPSUS with your Dow Jones file based on the variable CUSIP, so that only observations from the ACTU_EPSUS that are also on the Dow Jones file are kept.

You should try a PROC SQL with a CREATE command to create such a table, similar to what we have done in Section 5 of Module 3. The PROC SQL should bring all variables from each dataset. The WHERE condition should match the CUSIP variable in each dataset. Look at the log file to note how many records were created in this new table.

b) The final earnings table should have 600 observations (30 components of Dow Jones each quarter times 20 quarters). As of now, we have many more observations. This is because we did not put any other restriction on the data collection. For example, for PERMNO=10145 we are bringing many more earnings (annual, quarterly, before 1989, after 1993, etc) than the 20 quarters we need from 1989 thru 1993. Our next step is to filter out the data we do not need. Here they are:

· The variable PENDS from ACTU_EPSUS indicates the quarter of the earnings observation. It is a variable stored as a date, more specifically, the last date of the calendar quarter to which that earnings data refers to. For example, earnings for the 2nd quarter of 2005 (which finishes in the end of June) will have PENDS=”30JUN2005”d. Thus, if I need the earnings data for the year 2005, I would use a restriction in my data collection as in “01JAN2005”d<=PENDS<=”31DEC2005”d. For our sample, Use the initial and end date from the Dow Jones table to restrict the quarters you are accessing; that is, you will need to write a filter such as INIT_DATE<=PENDS<=END_DATE.

· The variable PDICITY from ACTU_EPSUS indicates periodicity of the earnings data. Companies release both quarterly (PDICITY=”QTR”) and annual



(PDICITY=”ANN”) earnings. Since our study uses quarterly earnings, you will need to restrict your data collection to observations with PDICITY=”QTR”.

· Earnings data includes information on various earnings measures, which are collected in different observations, as recorded in the variable MEASURE from ACTU_EPSUS: earnings per share (MEASURE=”EPS”), cash flows per share (MEASURE=”CFS”), etc. We will look at earnings per share only, thus restrict your data collection to observations with MEASURE=”EPS”.


Apply the three restrictions above over the dataset you created in step a). Use a WHERE condition in a DATA step. For example, if the table from a) is named D, create a table E as


data e;

set d

where;


Verify that your new table ends up with 600 observations. The additional variables you will need—besides the ones identifying the earnings (PERMNO, CUSIP, and PENDS)— are: the exact date of the announcement (ANNDATS from ACTU_EPSUS) and the actual value of the earnings that were announced (VALUE from ACTU_EPSUS). Thus, keep in your dataset only these variables. The dataset you want to create has the following format:

image.png

3. IBES also records the output of sell-side analysts. In particular, IBES records detailed earnings forecasts by sell-side analysts. Besides detailed data, IBES records summary statistics on these forecasts, such as the mean forecast by these analysts. We will need these mean forecasts as a proxy for the market expectation on upcoming earnings.

Let’s say that we wanted to know what was the market expectation, as of June 30, 2005, about the 2nd quarter of 2005’s earnings for IBM. This 2nd quarter’s earnings would be released sometime after June 30, 2005 (which is the end of the 2nd quarter). All we need to do is to look for the summary statistics on earnings forecasts for IBM around that specific date.

Data on summary statistics on earnings forecasts is stored in the STATSUMU_EPSUS dataset of the IBES database. Please use the PROC CONTENTS on this dataset. An observation from this dataset is identified by the firm’s identifier (OFTIC, CUSIP), by the fiscal quarter it refers to (FPEDATS, equivalent to the PENDS variable in the ACTU_EPSUS dataset), by the type of the earnings number (FISCALP, equivalent to the PDICITY variable in the ACTU_EPSUS dataset), by the measure (MEASURE, as in the



ACTU_EPSUS dataset), and by the date the summary statistics was computed (STATPERS variable).

Thus, in our example above for IBM, I would look for the observations in the STATSUMU_EPSUS dataset that attend to the following clause:

cusip=”45920010” and measure=”EPS” and fiscalp=”QTR” and fpedats=”30JUN2005”d and statpers<=”30JUN2005”d;

a) Please try to collect the summary statistics data for IBM according to the above conditions. Please print these observations. How many such observations are available?

You will notice that the clause above reads all summary statistics that were generated for that quarter for IBM, that is, potentially one generated in March 2005, another in April 2005, etc. An additional step would be to clean up the data to keep the one that is closest to June 30, 2005, that is, simply the latest such observation. In this case, it would be the observation with the STATPERS closest to June 30, 2005. What is the date (STATPERS) of this observation?

b) Now, for the sample of earnings data you obtained in step 2, collect summary statistics on earnings forecasts that are available before the end of the fiscal quarter. Here you should use a PROC SQL so that you can match the summary stats dataset with the earnings dataset, based on CUSIP and the dates. For example, the search condition should include a restriction that the STATPERS in the summary stats dataset is smaller than or equal to the FPEDATS variable in the earnings dataset).

The next step is to keep only the latest summary statistics for each quarter. The main variable you will need from the STATSUMU_EPSUS dataset is MEANEST (the average earnings forecast).

Now, a hint on how you can “keep only the latest summary statistics for each quarter.” There are many ways to accomplish this, but here is one.

Suppose you have a dataset D with all summary statistics for a quarter. Each observation is identified by the variables PERMNO, PENDS, and STATPERS. The idea is to use the NODUPKEYS clause in a PROC SORT; the clause keeps only one observation per set of variables identified in the BY clause. The code should be written as shown below:

proc sort data=d;

by permno pends descending statpers;


proc sort data=d nodupkeys; by permno pends;


This code sorts the D dataset by PERMNO, PENDS, then by the date the summary statistics was taken. The DESCENDING clause tells to sort by STATPERS

在线提交订单