S&P COMPUSTAT

Description

Standard & Poor's Compustat provides the annual and quarterly Income Statement, Balance Sheet, Statement of Cash Flows,
and supplemental data items on most publicly held companies in North America. Financial data items are collected from a wide
variety of sources including news wire services, news releases, shareholder reports, direct company contacts, and quarterly and
annual documents filed with the Securities and Exchange Commission. Compustat files also contain information on aggregates,
industry segments, banks, market prices, dividends, and earnings. Depending upon the data set, coverage may extend as far back
as 1950 through the most recent year-end.

UNT’s subscription to Compustat includes the files listed below:

Primary, Supplementary, Tertiary (pstann, pstqtr): Contain the largest companies on the NYSE and AMEX, including
all companies comprising the S&P Industrial Index, and companies listed on major exchanges.

Full coverage files (fca, fcq): Include companies listed on NASDAQ and regional exchanges, publicly held companies trading
common stock, and wholly owned subsidiaries trading preferred stock or debt.

Research files (mannr, mqtr): Contain companies that have been deleted from the Industrial files [Primary, Supplementary, Tertiary]
and the Full-Coverage files due to acquisition, merger, bankruptcy, liquidation, reverse acquisition, leveraged buyout, or because
they became a private company. The “Current” data file covers the most recent 20 years and is updated annually. There are two
additional research data files, one that covers the previous 20 years (currently, 1961-1980, updated annually, called “Backdata”)
and one that covers 1950 through 1969. The last one is not updated.

Bank files (bna, bnq): Data on about 600-700 banking institutions.

Canadian Files (cdnann, cdnqtr): Contain more than 900 major Canadian industrial companies that report in Canadian currency.

Prices, Dividends and Earnings files (pde): Contain market information and about 120 industry indexes and composites.


Apply for a Sol Account

Compustat data sets are maintained on sol.acs.unt.edu, the Academic Computing Services (ACS) multiuser research UNIX system.
To access Compustat data, you must have an account on Sol. You may apply for Sol access via the UNT Internet Services
Account Management page (EUID and password required). Students must have a faculty sponsor in order to gain access to Sol.
Currently, UNT users can access Compustat data only through SAS (Sol version) or Research Insight.
 

Accessing COMPUSTAT data

First, log into Sol using PuTTY.

Dr. Philip Baczewski, Associate Director of Academic Computing, has provided a sample SAS program that uses proc datasource to
read Compustat annual data. The program is based on the example found at
http://ftp.sas.com/techsup/download/base/cs2003help.8.doc and is intended to extract the entire Compustat data item set
(see the manual for more information about specific data item numbers) for companies with an SIC code of "7370" or "7371."

This sample SAS program is located in the /export/data/compustat/unt folder on Sol.  For more sample
programs, see the Additional Compustat Sample Programs section.

To access the sample program:

(1)  Copy the sample program from the "unt" directory to your home directory.  The UNIX command for copying files is cp.
       For example, the command "cp /export/data/compustat/unt/ds_fcann.sas ." copies the sample SAS program to your
       current directory (your home directory, if you have just logged in).  For a list of commonly used UNIX commands,
       please refer to the UNIX documentation.

       <NOTE>
       ds_fcann.sas is the default program.  To extract a specific industry's data, you need to modify this SAS program.
       (See the sample program code below, which includes the additions to the sample program.)

       Alternatively, you can use the WinSCP program to transfer files between Sol and your computer. WinSCP documentation
       is also available.

        

(2) To run the sample program, use the "SAS" command.  (This may take several minutes.)

     


(3) After the program is processed, two output files will be written to your current directory (i.e., your home directory). These
     files are called "fcannsub.export" and "ds_fcann.log".  The .export file is the output file and the .log file is the SAS log
     file.  If programming errors occur while SAS processes the sample program, the corresponding error messages will be stored
     in the .log file, and the .export file will not be generated.  To view the content of the files, use the "less" command.


Compustat Sample Program - To extract the entire Compustat data item set for companies with SIC code of "7370" or "7371."

/*---------------------------------------------------------------*
 * access a SAS-supplied SAS catalog dataset which supports      *
 * reading Compustat ASCII data created after April, 2003        *
 *---------------------------------------------------------------*/

    libname lib1 "/export/data/compustat/saslib/";

/*---------------------------------------------------------------*
 * access the Compustat annual data                              *
 *---------------------------------------------------------------*/

    filename data2003 '/export/data/compustat/fcann'
             RECFM=F LRECL=13612;

 /*--------------------------------------------------------------*
  * create OUT=csauy3 data set with ASCII 2003 Industrial Data   *
  * The "where" statement selects data for Computer Programming  *
  * Services companies (dnum = 7371) and Computer Programming,   *
  * Data Processing, etc. companies (dnum = 7370)                *
  *--------------------------------------------------------------*/

     proc datasource filetype=csaucy3  c=lib1.datafmt ascii
                     infile=data2003
                     interval=year
                     outselect=on
               /*      outcont=y3cont   */
                     outkey=y3key
               /*       outall=y3all    */
                     out=csauy3;
        keep _all_;
      /* The keep _all_ statement means that we are keeping all Compustat variables--*/
 
        where dnum in (7370,7371);

      /*-- dnum is used to extract companies in a specific industry group--*/

      /* -- If you were to extract a specific company, use cnum (meaning CUSIP)--*/

        run;

     proc sort data=y3key out=y3key;
        by dnum   cnum   cic  ein;
        run;

     proc sort data=csauy3 out=csauy3;
        by dnum   cnum   cic  file  zlist  smbl   xrel  stk;
        run;

     proc print data=y3key;
        title2 'OUTKEY= 2003 data set';
     run;

     proc contents data=csauy3;
        title2 'CONTENTS of 2003 data set';
     run;

  /*-- The OUT= data set is huge, so we print only the first 50  --*/
  /*-- observations. The obs= data set option limits this step   --*/
  /*-- only, so the export step below still gets every row.      --*/

     proc print data=csauy3(obs=50);
        title2 'OUT= 2003 Annual DATA SET';
        run;

   /*-- This step creates a file called "fcannsub.export" that is readable by SAS --*/
   /*-- Creating an export data set in a data step --*/
   libname xportout xport 'fcannsub.export';
      data xportout.fcann;
        set csauy3;
         run;
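
The comments in the sample program mention that cnum can be used to pull a single company instead of an industry group.
A minimal sketch of that variation follows, reusing the libname and filename statements from the sample program. It
assumes cnum is stored as a 6-character CUSIP issuer code (confirm the type and length in the proc contents output).
The value '594918' and the output name onefirm are illustrative placeholders, not part of the UNT sample.

```sas
    /* Assumes lib1 and data2003 are already defined as in the sample program */
    proc datasource filetype=csaucy3  c=lib1.datafmt ascii
                     infile=data2003
                     interval=year
                     out=onefirm;        /* hypothetical output data set name */
        keep _all_;
        /* cnum is assumed to be a character variable; replace the value */
        /* with the CUSIP issuer code of the company you need            */
        where cnum in ('594918');
    run;

    /* Confirm that rows came back before building the export file */
    proc print data=onefirm(obs=10);
        title2 'First 10 observations for the selected company';
    run;
```

If the log reports a type mismatch on the where statement, cnum is probably numeric in your file layout and the quotes
should be dropped.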


How to transport the fcannsub.export to your local SAS (after obtaining fcannsub.export from Sol)

  1. Use WinSCP to copy "fcannsub.export" from your Sol account to your local hard drive.

  2. Write a "reversed" SAS program that reads the data into your local SAS session:

    libname xportin xport "H:\COBA RA\fcannsub.export";

    data transport3;
        set xportin.fcann;
    run;


    <NOTE>

    This short SAS program reads the information in "fcannsub.export"
    and stores the data in the SAS "WORK" library.  The new data set is called
    "WORK.TRANSPORT3".

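If you would rather keep the original data set name, PROC COPY is another way to read a transport file. The sketch
below assumes the same local path as in step 2 and copies every member of the transport file into the WORK library.
Only the path should need to change for your machine.

```sas
    /* Point the XPORT engine at the transport file (path assumed from step 2) */
    libname xportin xport "H:\COBA RA\fcannsub.export";

    /* Copy all members of the transport file into WORK, keeping their names */
    proc copy in=xportin out=work;
    run;

    /* The member written on Sol was named fcann, so it arrives as WORK.FCANN */
    proc contents data=work.fcann;
    run;
```

The proc contents listing is a quick check that the variables and observation count match what was created on Sol.
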
       


Partial Output (The partial output has the first 31 observations):
After obtaining the SAS output, use the SAS Export Wizard to export the output to Microsoft Excel.





Additional Compustat Sample Programs

Compustat Sample Programs List
(These programs are intended to extract Compustat data for a portfolio of CUSIPs.)


Current: (1983-2002)

ds_current.sas combines fcann, pstann, and mannr


Current + b1: (1964-2002)

ds_currentb1.sas combines fcann, fcannb1, pstann, pstannb1, mannr, mannrb1


"Way-back" data: (1950-1969)

ds_allb2.sas combines fcannb2, pstannb2, mannrb2
 

Full Coverage Data (Fca): (To extract Fca data for a group of cusips)

ds_fcannall.sas Combines fcann, fcannb1, and fcannb2 (1950-2002)
ds_fcann+1.sas Combines fcann, fcannb1 (1964-2002)
ds_fcannc.sas Accesses fcann (1983-2002)
 

Primary, Supplementary, Tertiary Sample Programs (Pst): (To extract Pst data for a group of cusips)

ds_pstannall.sas Combines pstann, pstannb1, and pstannb2 (1950-2002)
ds_pstann+b1.sas Combines pstann, pstannb1 (1964-2002)
ds_pstannc.sas Accesses pstann (1983-2002)


Research Files (Mannr): (To extract Mannr data for a group of cusips)

ds_mannrall.sas Combines mannr, mannrb1, and mannrb2 (1950-2002)
ds_mannrc1.sas Combines mannr, mannrb1 (1964-2002)
 

Proc Datasource Procedures

The DATASOURCE procedure extracts time series data from many different kinds of data files distributed by various data vendors and stores them in a SAS data set. Once stored in a SAS data set, the time series variables can be processed by other SAS procedures.

The DATASOURCE procedure has statements and options to extract only a subset of time series data from an input data file. It gives you control over the frequency of data to be extracted, time series variables to be selected, cross sections to be included, and the time range of data to be output.
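
As a rough illustration of those controls, the sketch below narrows an extraction in three ways at once: keep limits the variables, where limits the cross sections, and range limits the time span. It assumes the libname and filename statements from the sample program are in effect and that data items follow the DATAn naming pattern (data6 and data12 are used purely as examples; check the proc contents output or the manual for the item numbers you need). The years in the range statement and the output name subset are likewise illustrative.

```sas
    /* Assumes lib1 and data2003 are already defined as in the sample program */
    proc datasource filetype=csaucy3  c=lib1.datafmt ascii
                     infile=data2003
                     interval=year       /* frequency of the series           */
                     out=subset;         /* hypothetical output data set name */
        keep data6 data12;               /* time series variables to select   */
        where dnum in (7370,7371);       /* cross sections to include         */
        range from 1998 to 2002;         /* time range to output              */
    run;
```

Restricting the variables and years this way keeps the output data set, and therefore any export file built from it, much smaller than a keep _all_ extraction.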

See also UNT's SAS/ETS User's Guide and UNT's Proc Datasource Documentation page.

 

COMPUSTAT Manuals

COMPUSTAT (North America) Technical Guide

Cover Page
Table of Contents
Chapter 1 - Introduction
Chapter 2 - About the COMPUSTAT IBM 360/370 General File Format
Chapter 3 - COMPUSTAT IBM 360/370 General File Formats
Chapter 4 - COMPUSTAT Prices, Dividends, and Earnings IBM 360/370 General File Formats
Chapter 5 - About the COMPUSTAT Character ASCII File Formats
Chapter 6 - COMPUSTAT Character ASCII File Formats
Chapter 7 - Revised COMPUSTAT Business Information File Character ASCII File Formats
Chapter 8 - COMPUSTAT Prices, Dividends, and Earnings Character ASCII File Formats
Chapter 9 - COMPUSTAT Transaction File Formats
Chapter 10 - COMPUSTAT Additional Files
Chapter 11 - Reference

 

COMPUSTAT (North America) User's Guide

Cover Page
Table of Contents
Chapter 1 - Introduction
Chapter 2 - Understanding the COMPUSTAT database
Chapter 3 - Financial Formulas
Chapter 4 - Financial Statements
Chapter 5 - Data Definitions I
Chapter 5 - Data Definitions II
Chapter 5 - Data Definitions III
Chapter 5 - Data Definitions IV 
Chapter 6 - Footnotes
Chapter 7 - Combined Data Items
Chapter 8 - Reference I
Chapter 8 - Reference II
Chapter 8 - Reference III

 


Source: Kellogg Research Computing, Northwestern University.  http://www.kellogg.northwestern.edu/researchcomputing/compustat.htm