NYSE Trade and Quote (TAQ) Database Documentation

TAQ does not include transaction data reported outside the Consolidated Tape hours of operation. As of
November 1999, those hours are 8:00am to 6:30pm EST. Trading in NYSE-listed securities by other markets
between 8:00am and 9:30am is also not included in TAQ.
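
As a minimal sketch of the reporting window described above, the following Python function checks whether a time of day falls inside the stated Consolidated Tape hours. The function name and the choice to treat both endpoints as inclusive are assumptions made for illustration; they are not part of TAQ.

```python
from datetime import time

# Consolidated Tape hours as stated above (in effect as of November 1999).
TAPE_OPEN = time(8, 0)
TAPE_CLOSE = time(18, 30)


def within_tape_hours(t: time) -> bool:
    """Return True if t falls inside the reporting window.

    ASSUMPTION: both endpoints are treated as inclusive.
    """
    return TAPE_OPEN <= t <= TAPE_CLOSE
```

A check like this can be used to confirm that a requested Time Range lies inside the hours TAQ actually covers before running an extraction.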

I.  Accessing TAQ via UNIX (Sol)
 

      A Simple TAQ Extraction Example:
   
    
      The following extraction example shows how to extract TAQ data from Sol for a specific time period and
      for a specific group of stocks.

 

II.  Accessing the TAQ database using TAQ 2 (a Windows-based program)

      TAQ2 is a Windows-based data extraction program that runs on Windows 98, 2000, and
       Windows NT.  The TAQ2 program allows you to filter and extract just the data you require and
       output the results in multiple data formats.

       Installing TAQ2

       1.  Open the TAQ2 zip file and save TAQ2.exe to your local or network hard drive.
       2.  Run TAQ2.exe from the destination hard drive.
       3.  When you first run the program, it will prompt you for the location of the data file (MAS file).
            Make sure that you save your CD-ROM data on the same hard drive where TAQ2.exe is
            located.
       4.  TAQ2 will not run if the data directory is not specified in advance.
       5.  Exit the program to save your configuration before running any data extractions.

       TAQ 2 Data Sources
      

       TAQ data sets can be extracted either from the TAQ CD-ROMs (UNT Library Call # CD-ROM 246)
       or from Sol.  To extract data from the TAQ CD-ROMs, simply copy all of the files on the CD-ROMs
       to your local hard drive(s).

       You can also extract TAQ data sets from Sol using the WinSCP program, which is free to download.
       Once it is downloaded and installed, double-click the WinSCP icon on your desktop.

      


      On the WinSCP Login screen, type "sol.acs.unt.edu" in the 'Host Name' box.  Then use your Sol username
      (your EUID) and password to log in.  WinSCP can do all basic operations with files, such as copying and
      moving (to and from a remote computer).  It also allows you to rename files and folders, create new folders,
      and change properties of files and folders.
       
     

      WinSCP offers selectable program interfaces.  In the two-panel interface, a local folder is displayed in the left panel
      and a remote folder in the right panel.  For the purpose of TAQ data transfer, the left panel is your local
      hard drive (C:\ or D:\) and the right panel is the Sol file directory.  Files are transferred between these two
      panels (folders), though it is possible to transfer files into a different folder.  TAQ files are located in the
      \ <root> directory (not the "home" directory) in the right panel.  The specific directory location is:

     \ <root>  =>  export  =>  data  =>  TAQ
       
     Once you are in the TAQ directory, you should see the 1993 - 1998 TAQ data folders.  To extract these
     data sets, transfer the data periods that you are interested in to the left panel (your local hard drive) and
     then unzip the files there. 

     <Note>  Be sure to copy all of the files including the MAST, DIV, CT, and CQ files.
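
The note above can be checked after a transfer with a short script. The following Python sketch reports which of the four file kinds (CT, CQ, MAST, DIV) have no matching file in a local folder. The filename patterns come from the "TAQ CD-ROM File Types" section of this document; the function name and the case-insensitive matching are assumptions made for illustration.

```python
from pathlib import Path

# Filename prefix and extension for each file kind, as listed under
# "TAQ CD-ROM File Types" (Tyyyymmx.BIN, Qyyyymmx.BIN, Myyyymm.tab, Dyyyymm.TAB).
REQUIRED_KINDS = {
    "CT": ("T", ".BIN"),
    "CQ": ("Q", ".BIN"),
    "MAST": ("M", ".TAB"),
    "DIV": ("D", ".TAB"),
}


def missing_taq_kinds(folder):
    """Return the sorted file kinds that have no matching file in folder.

    ASSUMPTION: filenames are compared case-insensitively.
    """
    names = [p.name.upper() for p in Path(folder).iterdir() if p.is_file()]
    missing = []
    for kind, (prefix, ext) in REQUIRED_KINDS.items():
        if not any(n.startswith(prefix) and n.endswith(ext) for n in names):
            missing.append(kind)
    return sorted(missing)
```

An empty list means at least one file of each kind is present. The check only looks at filenames, so it cannot tell whether every month you need was copied.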
   
      

    
 

       
       Creating a TAQ Job File
     
     
       Creating a TAQ Job File is simple: just follow the five steps within the program.

        Step 1:  Select Data
   
       

     
       Job Description: This is used to describe the job in detail. It is not the file name.

       Process Options: You can choose to extract trades, quotes, statistics, or any combination thereof.

       Include Header Information: Creates a line at the top of the output file with the names of the fields.
        This is convenient if loading the data into a spreadsheet or identifying the fields while viewing the data.
        First Record Only shows the header only once for the entire output.

       Include MAST Information: Specifies if you want to include security master file information. This
       information is shown at the beginning of each security’s data within each extraction file.

       Include DIV Information: For NYSE listed stocks only. Specifies whether the dividend information
       should be included prior to each security’s data within each extraction file.

       Include Corrections: Adds records that describe the type of error a trade may have had.

       Job Options: An Active job will be included in the “Run Jobs” option when you first start the program.
       This attribute can be modified at any time.

       From/To dates: Choose the date range you wish to extract. If the date range is outside the range of
       data contained on the current TAQ CD-ROM, the program will prompt you to load the correct CD. If
       you have copied all the data to a mass storage device, the program will automatically traverse each CD
       in sequential order to extract the data.

       Time Range: Choose Selected Time in the Time Period box, then enter the time of day you wish to extract in
       Time Range.


       Step 2.  Select Issue

     


       
        Input Type: Choose whether you want to choose securities by Ticker Symbol or by CUSIP Number.

        Input Source: Specifies whether you want to choose all the securities, selected securities as specified
        in the ticker symbol window, or whether you want to link to an ASCII file containing ticker symbols
        (one ticker symbol per line).

        Linked File: If Attached File is chosen, then specify the location and name of the
        ticker-symbol/CUSIP-number input file.
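
A linked input file of the kind described above (an ASCII file with one ticker symbol per line) can also be produced outside the program. The Python sketch below is one way to do it. The function name is hypothetical, and the upper-casing, duplicate removal, and Windows-style line endings are assumptions rather than documented TAQ2 requirements.

```python
def write_symbol_list(symbols, path):
    """Write one symbol per line to an ASCII file and return the symbols written.

    ASSUMPTIONS: symbols are upper-cased, duplicates and empty entries are
    dropped, and lines end with CR+LF because TAQ2 is a Windows program.
    """
    cleaned = []
    for symbol in symbols:
        symbol = symbol.strip().upper()
        if symbol and symbol not in cleaned:
            cleaned.append(symbol)
    # newline="\r\n" makes Python translate each "\n" into CR+LF on write.
    with open(path, "w", encoding="ascii", newline="\r\n") as fh:
        for symbol in cleaned:
            fh.write(symbol + "\n")
    return cleaned
```

For example, `write_symbol_list(["wmt", "PG", "dis"], "tickers.txt")` writes the three symbols on separate lines; the filename here is only a placeholder.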

        Enter Symbol: Use this field to enter symbols. Then press the Enter key or the down arrow button
        located to the right of the field after each symbol to add it to the Symbols to Process list. To find out
        what each button in this section means, hover your mouse pointer over each button for a balloon
        description.

        Saving the Ticker Symbols or CUSIP Numbers to a List File:

        After you have entered tickers into the input box, you can optionally save them to a linked List file by
        pressing the save button in this section.  In the window that appears, enter a filename and press Save.


        Step 3: Data Formats


     

      This step allows you to choose how your data will be formatted.  The options are self-explanatory.


       Step 4: Filter Data


    

     TAQ allows you to selectively choose the fields that you want to show in the output files.
      Additionally, you can filter your data extractions by particular stock exchanges:

      Note: Instinet (O) now clears under NASD, so this code does not appear on current CDs. It
      was only valid during January and February 1993.


      Step 5: Output

      The output can go to a file or to the screen, and you can view the output by pressing the View
      Output button. There are four output types, described below, and ASCII output can be brought
      into many applications. If you need disk space or are finished with previous queries, you can
      check the Overwrite Existing Files box. To create output files, enter a filename in the
      space provided. By default, if a directory is not specified, the program will automatically
      save the files to the Output subdirectory of the main program directory.

     


       Output files in TXT formats:

     
       TAQ 2 Trades Output.TXT

       TAQ 2 Quotes Output.TXT

       TAQ 2 Stats Output.TXT

      
       Other TAQ Issues

       1.  CUSIP-to-Ticker Symbol Translation Program (CUSIP.exe)

            This translation program reads a list of CUSIPs stored in CUSIP.IN and writes the output to CUSIP.OUT.
            Unmatched CUSIPs appear as blank entries.  The program prompts the user for the number of characters
            (up to 12) to be used in matching the symbols.  Any text editor may be used to create or modify the CUSIP.IN
            file.  This file must contain one or more CUSIPs, each listed on a separate line.  CUSIP.OUT can be used as input
            to the selection program if blank lines for unmatched CUSIPs are removed and it is renamed or copied to SELECT.IN.
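
The clean-up step described above (removing blank lines and copying CUSIP.OUT to SELECT.IN) can be sketched in Python as follows. The function name is hypothetical, and the sketch assumes that CUSIP.OUT holds one symbol per line and that each unmatched CUSIP produces a line that is empty or contains only whitespace.

```python
def cusip_out_to_select_in(src="CUSIP.OUT", dst="SELECT.IN"):
    """Copy src to dst without blank lines and return the number of symbols kept.

    ASSUMPTION: each line of src is either one symbol or blank (unmatched CUSIP).
    """
    with open(src, "r") as fh:
        kept = [line.strip() for line in fh if line.strip()]
    with open(dst, "w") as fh:
        for symbol in kept:
            fh.write(symbol + "\n")
    return len(kept)
```

The original CUSIP.OUT is left untouched, so the blank entries can still be inspected to see which CUSIPs went unmatched.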

        2.  Index Reading Program (IDXREAD.exe)

             This program determines the beginning and ending locations for the ticker symbols listed in SELECT.IN for the
             range of dates specified by the user.  It is useful when debugging your own selection programs, or for ensuring
             that the CD-ROM drive is retrieving data properly.  To run the IDXREAD program, double-click the icon called
             "IDXREAD.pif."

        3.  Direct Reading Program
            
            
             This program retrieves quotes and/or trades directly from the binary file(s).  The user specifies the starting and
             ending locations of the symbol(s) to be retrieved.  When combined with IDXREAD.exe, this program is useful
             when debugging your own selection programs or for ensuring that the CD-ROM drive is retrieving data properly.

             To run the DIRECT program, type <direct> at the DOS prompt or, while in Windows,
             double-click the icon associated with DIRECT.pif.

        4.  Master Program

             This program retrieves user-selected symbols by security type and/or exchange.  The user types two arguments
             after MAST: the first is the exchange(s) and the second is the issue type(s).  The options must be
             entered as command-line arguments.  For example, entering

                        MAST  NT  CPW

             will retrieve all common stocks (C), preferred stocks (P), and warrants (W) on the NYSE (N) and NASD (T).  The program
             then prompts the user for the year and month of the CD-ROM being used.
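
To make the two-argument convention concrete, the Python sketch below decodes the exchange and issue-type letters used in the example above. It is an illustration only and is not part of the MAST program. The lookup tables contain just the codes named in this document (N, T, C, P, W); any other codes the program may accept are not listed here, so the sketch rejects them.

```python
# Only the codes mentioned in this document; the real program may accept more.
EXCHANGES = {"N": "NYSE", "T": "NASD"}
ISSUE_TYPES = {"C": "common", "P": "preferred", "W": "warrant"}


def decode_mast_args(exchanges, issue_types):
    """Translate arguments such as ("NT", "CPW") into readable names."""
    try:
        decoded_exchanges = [EXCHANGES[c] for c in exchanges.upper()]
        decoded_types = [ISSUE_TYPES[c] for c in issue_types.upper()]
    except KeyError as err:
        raise ValueError(f"code not listed in this document: {err.args[0]}") from None
    return decoded_exchanges, decoded_types
```

Calling `decode_mast_args("NT", "CPW")` returns the NYSE and NASD exchanges together with the common, preferred, and warrant issue types.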

         5.  Selection Program (SELECT.exe)

              This access routine retrieves quotes and/or trades (with corrections) for the ticker symbols and dates specified by the
              user.  The symbols may be fed to the program in two ways:
          
              I)  Listed in the SELECT.IN file.  Any text editor may be used to create or modify the SELECT.IN file, which
                  must contain one or more symbols, each listed on a separate line.  A sample SELECT.IN file is supplied for
                  testing the selection program when the installation is complete.

             II)  Typed as command-line arguments.  When using this option, the symbols must be typed in capital letters.
                   For example, entering

                   SELECT  WMT  PG  DIS

                   will retrieve data for the symbols WMT, PG, DIS.


           TAQ CD-ROM File Types

        The following files are included in TAQ2 CD-ROMs:

            1.  CT Binary File (Tyyyymmx.BIN):

                 The Consolidated Trade binary file, Tyyyymmx.BIN, is written in binary integer format with a fixed record length
                 of 29 bytes (without ending carriage return or line feed). The letter 'x' in the filename is the letter of the CD the
                 file resides on.

            2.  CT Index File (Tyyyymmx.IDX):
   
                 The Consolidated Trade index file, Tyyyymmx.IDX, is written in binary integer format with a fixed record length
                 of 22 bytes (without ending carriage return or line feed). The letter 'x' in the filename is the letter of the CD the file
                 resides on. The TDATE field is 4 binary bytes in the format: yyyymmdd.
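
As an illustration of the 4-byte date field, the following Python sketch decodes a yyyymmdd integer into a calendar date. The byte order is not stated in this document; the sketch assumes a little-endian unsigned integer, which should be verified against real data before use. The function name is hypothetical.

```python
import struct
from datetime import date


def decode_yyyymmdd(raw: bytes) -> date:
    """Decode a 4-byte binary date field holding an integer such as 19930104.

    ASSUMPTION: the integer is stored little-endian and unsigned.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    (value,) = struct.unpack("<I", raw)
    year, rest = divmod(value, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)
```

If the decoded dates look implausible, the byte-order assumption is the first thing to revisit (for example, by trying `">I"` instead).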

           3.  CQ Binary File (Qyyyymmx.BIN):

                 The Consolidated Quote binary file, Qyyyymmx.BIN, is written in binary integer format with a fixed record length
                 of 39 bytes (without ending carriage return or line feed). The letter 'x' in the filename is the letter of the CD the file
                 resides on.

           4.  CQ Index File (Qyyyymmx.IDX):

                 The Consolidated Quote index file, Qyyyymmx.IDX, is written in binary integer format with a fixed record length
                 of 22 bytes (without ending carriage return or line feed). The letter 'x' in the filename is the letter of the CD the file
                 resides on. The QDATE field is 4 binary bytes in the format: yyyymmdd.
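
Because the four binary files above use fixed record lengths with no line terminators, they can be split into records by size alone. The Python sketch below picks the record length from the filename and yields raw records. It assumes the files contain nothing but records (no header or trailer), and it does not interpret the fields, since the field layout is not given here. The function names are hypothetical.

```python
import os

# Record lengths in bytes, as stated above for each file type.
RECORD_LENGTHS = {
    ("T", ".BIN"): 29,
    ("T", ".IDX"): 22,
    ("Q", ".BIN"): 39,
    ("Q", ".IDX"): 22,
}


def record_length_for(filename):
    """Return the record length implied by a Tyyyymmx/Qyyyymmx filename."""
    base = os.path.basename(filename).upper()
    key = (base[:1], os.path.splitext(base)[1])
    if key not in RECORD_LENGTHS:
        raise ValueError(f"not a CT/CQ binary or index file: {filename}")
    return RECORD_LENGTHS[key]


def iter_records(path):
    """Yield each fixed-length record of the file as raw bytes.

    ASSUMPTION: the file holds only records, with no header or trailer.
    """
    size = record_length_for(path)
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(size)
            if not chunk:
                break
            if len(chunk) != size:
                raise ValueError("file ends with a partial record")
            yield chunk
```

A file whose size is not a multiple of the record length raises an error, which is a quick way to detect a truncated or corrupted transfer.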

           5.  The Master Table (Myyyymm.tab):

                The master table contains reference information about the stocks in the trade and quote files. Some fields apply
                only to NYSE and/or AMEX issues, and are blank or zero-filled when not applicable to the issue. For those issues
                that clear through the National Securities Clearing Corp. (NSCC), there is at least one record. If an issue does not
                clear through the NSCC, a record will not appear in the master file.

           6.  The Dividend File (Dyyyymm.TAB):
           
                The dividend file, Dyyyymm.TAB, contains one record for each symbol that either paid a dividend or redistributed
                stock during the month. In rare cases, a symbol may have two records for one month. The file is written in character
                format with a fixed record length of 51 bytes (53 including carriage return and line feed).
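
The dividend file's fixed-length character records can be read and validated with a few lines of Python. The sketch below splits the file into 53-byte lines, checks that each ends with a carriage return and line feed, and returns the 51-character records. The ASCII encoding is an assumption, the function name is hypothetical, and the individual fields are not parsed because their layout is not given here.

```python
DATA_LEN = 51   # record length stated above, excluding CR+LF
LINE_LEN = 53   # record length including CR+LF


def read_dividend_records(path):
    """Return the 51-character records of a Dyyyymm.TAB file as strings.

    ASSUMPTION: the file is ASCII text with no header or trailer.
    """
    with open(path, "rb") as fh:
        raw = fh.read()
    if len(raw) % LINE_LEN:
        raise ValueError("file size is not a multiple of 53 bytes")
    records = []
    for start in range(0, len(raw), LINE_LEN):
        line = raw[start:start + LINE_LEN]
        if not line.endswith(b"\r\n"):
            raise ValueError(f"record at byte {start} lacks CR+LF")
        records.append(line[:DATA_LEN].decode("ascii"))
    return records
```

Reading in binary mode keeps the carriage returns intact, so the length check works the same way on Windows and UNIX.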