The Berkeley Options Data Base

User's Guide

 

 

 

 

 

 

The Berkeley Options Data Base

Institute of Business and Economic Research #1922

University of California, Berkeley

Berkeley CA 94720-1922

 

 

 

 

 

 

 

 

 

Release 3.0

 

Revised August 1998

 

 

 

 

1 Introduction

The Berkeley Options Data Base (BODB) is a complete record of trading activity on the floor of the Chicago Board Options Exchange (CBOE). Derived from the CBOE's Market Data Retrieval (MDR) tapes, the database contains every bid-ask quote and every trade recorded on the floor of that exchange, time-stamped to the nearest second. The database begins on August 23, 1976 and is updated annually, usually in March or April. At the time of this printing (June 1, 1995), the database is available through December 31, 1994.

This document is intended to be a comprehensive user's guide to the Berkeley Options Data Base. It describes the database in detail, and contains valuable information on how to acquire, access, and conduct research using the database. As the database is currently designed to be accessed through a unix workstation, specific technical details are offered only for the unix platform. For use on other systems, it is necessary to find a utility that can read a unix tarfile. If you have any further questions about the database, or comments about this guide, please contact the database manager.

This guide is organized as follows. Section 2 describes the MDR file, which contains the raw data from which BODB is derived. It explains how the MDR file is created, how the tapes may be purchased, how the information is stored, and how the data are organized. Section 3 describes the BODB, how it may be purchased, how it is stored, and how the data are organized. Section 4 contains some historical information about the BODB that may help you understand things you have read elsewhere about the database. It describes former BODB products that have been discontinued, such as the ``consolidated format'' and the ``supplemental tape,'' and describes format changes in the ``resorted'' data. Section 5 is a research guide, containing pointers on how to identify ticker symbols, how stock splits are treated, and other such matters. It also describes various other data sources that may be of interest. Section 6 contains technical advice on how to read the data off the tapes and sample computer programs (in various languages) that may be used to read and manipulate the data. The appendix contains a bibliography of research papers that have employed the BODB or MDR data, a complete list of all BODB files and the calendar dates contained in each, and a list of ticker symbols.

 

 

2 The Market Data Retrieval File

The Market Data Retrieval (MDR) file is produced by the CBOE, and constitutes a complete record of bid-ask quotes and trades recorded on the floor of that exchange. Each record is time-stamped to the nearest second, and contains the contemporaneous price of the underlying security.

 

2.1 Availability

The entire MDR may be purchased from the CBOE, by the month, on 6250 bpi magnetic tapes. As of January, 1995, the rate was $500 per month of data, with a 10% discount on orders of six months or more, and a 15% discount on orders of one year or more. The MDR data for S&P 100 (OEX) index options may be purchased separately at a price of $450 for three years of data. Data for S&P 500 (SPX) index options may be purchased at the same rate. Other individual securities from the MDR may be purchased at the rate of $100 per month for the first security and $30 per month for each additional security.

The CBOE also offers various summary files. The ``Option Summary'' file contains daily high, low, and closing prices, trading volume and open interest, since October 1, 1985. One year of the Option Summary file for up to ten securities may be purchased for $125 on 6250 bpi tapes or $100 on floppy disk or hardcopy. The ``Expanded Option Summary'' file contains all the information in the Option Summary file plus the underlying stock price. One year of the Expanded Option Summary file for up to ten securities may be purchased for $150 on 6250 bpi tapes or $125 on floppy disk or hardcopy. The ``Index Summary'' file contains daily information for the various indexes on which options are traded at the CBOE, including the daily high, low, and closing index value, the change from the previous close, total trading volume on calls, total trading volume on puts, total open interest on calls, and total open interest on puts. The Index Summary file may be purchased at the rate of $25 per index on floppy disk, in which case data are available beginning October 1, 1985, or $15 per index hardcopy, in which case the data are available beginning March 11, 1983.

The ``Volume Summary'' file contains daily observations of total call volume, total put volume, total call open interest, and total put open interest for any option class trading on the CBOE. One year of data on up to ten securities may be purchased on hardcopy or floppy disk for $75. Finally, the ``Total CBOE Volume Summary'' file contains daily observations of total call volume and total put volume on the CBOE. The entire file, which begins in January, 1978, may be purchased on floppy disk for $25 or hardcopy for $15. For ordering information, contact the CBOE data sales hotline at (312) 786-7426. To obtain information on contract specifications and ticker symbols, call the CBOE marketing department at (312) 786-7434.

 

 

2.2 Data Entry

The MDR contains four main types of records: trade records, quote records, cancel records, and underlying records. Quote records contain bid and ask prices, while trade records contain transaction price and volume. Cancel records, as the name indicates, cancel previous records on the same underlying contract. Trade, quote, and cancel records are all time-stamped, and contain a contemporaneous observation of the underlying stock price. Underlying records contain information about the underlying stock that is recorded on the MDR without a trade, quote, or cancel having occurred.

Some quote records are recorded on the floor of the exchange by a Quote Reporting Terminal Operator, who enters bid-ask quotes as they are shouted in the trading crowd. The reporting lag for quotes should be very short, only as long as is required for the terminal operator to enter the option identification and the quote, which should be less than five seconds. In addition, many options are now quoted through the ``autoquote'' system. In this case, a market maker chooses the input parameters for a Black-Scholes or Cox-Ross-Rubinstein pricing model, and bid-ask quotes are automatically updated by computer whenever the underlying stock price changes. The autoquote system has led to a large increase in the volume of data recorded on the MDR over the last few years. A large portion of the recent MDR data is made up of quotes on index options, where the underlying index is recalculated every 15 seconds, and the autoquote system continually spits out fresh quotes.

Trade records are recorded by a Price Reporting Terminal Operator. The reporting delay for trades may be considerably longer than for quotes. After a verbal agreement to a trade has been consummated between two members of the trading crowd, the seller writes up the trade on a blank ticket he is carrying, and deposits a copy of the sell ticket on a conveyor belt at the post. This process generally takes from 5 to 40 seconds, depending on the number of traders involved, how fast they write, and how far they are from the conveyor belt. When trading is particularly active, traders might hold onto these tickets for up to several minutes before depositing them on the conveyor belt. Upon receiving the ticket, the Terminal Operator immediately removes it from the bin, enters the stock symbol, expiration month, and strike price with a single keystroke, and then separately enters the number of contracts traded, the transaction price, and the identifying symbols of the buying and selling floor traders.

The computer completes the record by automatically registering the time of day and the most recent transaction price of the underlying stock. One terminal operator handles call options and another puts, at separate terminals. In special circumstances, the Terminal Operator will also enter a ``transaction prefix,'' indicating, for example, that the trade is known to be part of a spread order, or is known to be out of sequence. Because trades take longer to record than quotes, great care should be taken in interpreting the time sequence reported in the MDR or Berkeley Options Data Base.

 

 

2.3 Storage Format

The MDR may be acquired from the CBOE on standard or non-labelled 6250 bpi magnetic tapes. The Berkeley Options Data Base receives the MDR on non-labelled tapes, on which the MDR is stored as a large fixed-length EBCDIC file, on multiple tapes, with record length 61 and blocksize 32757. Because the file is stored as a multiple-tape file, each MDR tape is crammed full of data. On a unix system, with 9-track tape drive designated /dev/rst1, the data may be transferred to hard disk using the command:

 

dd if=/dev/rst1 of=mdrdata bs=32757 cbs=61 conv=ascii

The if= option specifies the input file or device, the of= option specifies the output file, bs designates the blocksize, conv instructs the program to convert the data (in this case from EBCDIC to ASCII), and cbs (conversion buffer size) tells the program to split the file into lines as it converts it. Note that the cbs option only works when the conv option is specified.

This will create a fixed-length file of around 257,000,000 bytes. This number is slightly larger than the capacity of a 9-track tape because the conversion program adds end-of-line characters.
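On a system without dd, the same conversion can be approximated in a few lines of Python. The following is only a sketch under stated assumptions, not the procedure used by BODB: it assumes the raw tape image has already been copied to disk as bytes, and it uses Python's cp037 codec, which is one common EBCDIC code page and may differ from the dd conversion table for a few punctuation characters. The function name ebcdic_to_lines is hypothetical.

```python
RECORD_LENGTH = 61  # bytes per MDR record, as stated above


def ebcdic_to_lines(raw: bytes, record_length: int = RECORD_LENGTH) -> list:
    """Split a fixed-length EBCDIC byte string into decoded text records.

    Assumption: the data use code page 037; the actual tapes may use a
    slightly different EBCDIC variant.
    """
    usable = len(raw) - len(raw) % record_length  # ignore a trailing partial record
    return [
        raw[i:i + record_length].decode("cp037")
        for i in range(0, usable, record_length)
    ]


# Round-trip demonstration on a synthetic record (not real tape data).
sample = "920922GPSOG142911".ljust(RECORD_LENGTH, "0")
lines = ebcdic_to_lines(sample.encode("cp037") * 2)
print(len(lines), lines[0] == sample)
```

Writing each decoded record followed by a newline would give a file similar to the dd output described above.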

2.4 Sorting Order

Within the MDR, records are sorted first by date, then by underlying security. Within each underlying security calls are listed before puts. The calls (puts) are sorted in order of expiration month, within each expiration month by strike price, and within each strike price the records are listed chronologically. In summary, records are sorted according to the scheme

DATE : UNDERLYING ASSET : CALL/PUT : EXPIRATION : STRIKE : TIME

 

 

2.5 Record Layout

The MDR file contains fixed-length records with the following (undelimited) fields:

Description               Field Type   Length
Trade Date                Numeric      6
Option Class              Alpha        3
Expiration Month Symbol   Alpha        1
Strike Price Symbol       Alpha        1
Time                      Numeric      6
Trade Price Integer       Numeric      3
Trade Price Fraction      Numeric      3
Volume                    Numeric      5
Bid Price Integer         Numeric      3
Bid Price Fraction        Numeric      3
Ask Price Integer         Numeric      3
Ask Price Fraction        Numeric      3
Stock Price Integer       Numeric      5
Stock Price Fraction      Numeric      1
Extra Space                            1
Security Type Symbol      Numeric      1
Record ID                 Numeric      2
Prefix Code               Alpha        4
Put/Call Code             Alpha        1
Expiration Month          Alpha        3
Strike Price              Alpha        3

 

 

Here are a few sample lines from the MDR file:

920922GPSOG142911000000000000040000040040003320102    PMAR035
920922GPSOG143914000000000000040000040120003310102    PMAR035
920922GPSOG144049004004000050000000000000003310101    PMAR035
920922GPSOH083237000000000000070120070240003330102    PMAR040
920922GPSOH083602000000000000070160070280003330102    PMAR040
920922GPSOH083651000000000000070120070240003330102    PMAR040
920922GPSOH084614000000000000070080070200003340102    PMAR040
920922GPSOH085117000000000000070120070240003350102    PMAR040
920922GPSOH085611000000000000070080070200003350102    PMAR040

 

2.5.1 Date and Time

These transactions were recorded on September 22, 1992. This can be ascertained from the first six characters of each line [1-6], which contain the date in the form YYMMDD.

Characters [12-17] contain the timestamp, recorded in Central standard time, in the form HHMMSS. Thus, the third record was recorded at 2:40:49 P.M., and the fourth record was recorded at 8:32:37 A.M.
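As an illustration, the date and time fields can be combined into a single timestamp. This is a minimal sketch in Python; the function name mdr_timestamp is hypothetical. Positions in this guide are 1-based and inclusive, so [1-6] corresponds to the Python slice [0:6] and [12-17] to [11:17].

```python
from datetime import datetime


def mdr_timestamp(record: str) -> datetime:
    """Combine the date [1-6] and time [12-17] of an MDR record."""
    return datetime.strptime(record[0:6] + record[11:17], "%y%m%d%H%M%S")


# First 17 characters of the third sample record above.
ts = mdr_timestamp("920922GPSOG144049")
print(ts)  # a naive datetime; the Central time zone is not attached
```

Note that the two-digit year is resolved by Python's own rule for %y, which places the years 69 through 99 in the twentieth century.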

2.5.2 Ticker Symbol

Between the date and time are five characters [7-11] containing a ticker symbol that completely identifies the option contract. Note that only the first three of these characters appear in the Berkeley Options Data Base.

The first three letters are unique to the underlying asset. One or two of these characters may be blank. In the example above, the letters GPS signify that these are equity options on Gap Stores stock. For options on stocks traded on the New York Stock Exchange or the American Stock exchange, these three letters will generally be the same as the underlying stock market ticker symbol. For options on NASDAQ stocks, the option ticker usually contains two letters from the NASDAQ ticker (not necessarily the first two), plus the letter Q. So, for example, NASDAQ ticker symbol AAPL might become AAQ. Other ticker symbols indicate index options, LEAPS, interest rate options, or other types of options. For more details, see the section on ticker symbols, below.

The fourth letter of the ticker symbol identifies the contract type and expiration date. The letters A through L designate call options expiring in January (A) through December (L). The letters M through X represent put options expiring in January (M) through December (X). In this example, the letter O designates a March put option.

The fifth letter identifies the last two digits in the option's strike price. The letter A signifies a strike price ending in 05, B is a strike ending in 10, and so on. In the example above, the first three records are for options with a strike price ending in 35 (G), and the last six are for options with a strike price ending in 40 (H).

The five-letter ticker symbols provide a convenient way to sort the data. In fact, the MDR file is sorted, within each calendar day, according to the five-letter ticker symbol. This means that, as described above, the data are sorted by underlying firm, and within each firm the calls are separated from the puts, calls and puts are each sorted by expiration month, and within each expiration month in order of strike price. The information contained in the fourth and fifth letters of the ticker symbol is duplicated in the last seven characters of each line [55-61], where it is presented in a more intuitive format. In this case, the last seven characters are PMAR035 or PMAR040, indicating the March 35 and March 40 puts.
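The letter codes described in this section can be decoded mechanically. The sketch below is illustrative only: the function name decode_contract_letters is hypothetical, and it assumes that the strike letters continue in steps of five beyond the cases spelled out above (A, B, G, and H).

```python
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def decode_contract_letters(ticker: str) -> tuple:
    """Decode the fourth and fifth letters of a five-character MDR ticker.

    Returns (put_call, month, strike_ending), e.g. ("P", "MAR", 35).
    """
    month_letter, strike_letter = ticker[3], ticker[4]

    month_index = ord(month_letter) - ord("A")  # A..L -> 0..11, M..X -> 12..23
    if not 0 <= month_index < 24:
        raise ValueError(f"unexpected expiration letter {month_letter!r}")
    put_call = "C" if month_index < 12 else "P"

    # Assumption: strike letters run A=05, B=10, ... in steps of 5, so that
    # T would end in 00. Letters past T are not covered and are rejected.
    strike_index = ord(strike_letter) - ord("A")
    if not 0 <= strike_index < 20:
        raise ValueError(f"unexpected strike letter {strike_letter!r}")
    strike_ending = (5 * (strike_index + 1)) % 100

    return put_call, MONTHS[month_index % 12], strike_ending


print(decode_contract_letters("GPSOG"))
print(decode_contract_letters("GPSOH"))
```

For the sample tickers this agrees with the PMAR035 and PMAR040 codes carried at the end of each record.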

2.5.3 Record Type and Prefix Code

Immediately to the left of these last seven characters is a six-character field [49-54] containing a numeric record type (indicating whether it is a trade, quote, cancel, or underlying record) and, on occasion, a four-letter ``Prefix'' code, indicating additional information about the record. The four record types are:

 

 

Number Type
01 Trade
02 Quote
03 Cancel
04 Underlying

 

 

and the prefix codes are:

Code Meaning
ROTA Opening Rotation
ENDR End of Opening Rotation
AUTO Start of RAES, the Electronic Execution System
RAES Transaction was Executed Electronically
ENDA End of RAES
OPEN Opening Trade, Recorded Late, Out of Sequence
OPNL Opening Trade, Recorded Late, In Sequence
OSEQ Recorded Late, Out of Sequence
LATE Recorded Late, In Sequence
SPRD Record is Part of a Combination Trade
STDL Record is Part of a Straddle
FAST Recorded under Fast Trading Conditions
HALT Reopen After Trading Halt
CLOS Closing Record
CNCO Cancel the Opening Trade
CANC Cancel the Last Trade, if it is not the Opening Trade
CNCL Cancel Another Trade, not the Last or Opening Trade
CNOL Cancel the Only Trade of the Day

 

 

In the sample above, the third record is a trade record, and the others are all quote records.
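A short sketch of how the six-character field [49-54] might be split into its two parts follows. The names are hypothetical, and the sample record is rebuilt from its individual fields so that the four-character prefix area (blank in this record) keeps its full width.

```python
RECORD_TYPES = {"01": "Trade", "02": "Quote", "03": "Cancel", "04": "Underlying"}


def record_type_and_prefix(record: str) -> tuple:
    """Return (record type name, prefix code) from positions [49-54]."""
    field = record[48:54]
    record_id = field[:2]        # [49-50], numeric record type
    prefix = field[2:].strip()   # [51-54], empty string when no prefix applies
    return RECORD_TYPES[record_id], prefix


# Third sample record, assembled field by field (61 characters in total).
third = (
    "920922" + "GPSOG" + "144049"   # date, ticker, time
    + "004004" + "00005"            # trade price, volume
    + "000000" + "000000"           # bid, ask
    + "000331" + "0" + "1"          # underlying price, extra space, security type
    + "01" + " " * 4                # record ID, blank prefix
    + "PMAR035"                     # put/call, expiration month, strike
)
print(len(third), record_type_and_prefix(third))
```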

2.5.4 Trade Price and Contract Volume

For trade records, the transaction price is recorded at [18-23], with the integer portion of the price recorded in the first three bytes [18-20] and the fractional part, recorded in thirty-seconds of a dollar, in the next three [21-23]. For example, in the third record in the above sample, the price is recorded as 004004, which translates to $4 1/8. Contract volume is recorded in the field [24-28], which in this case is 00005, or 5 contracts. Both the trade price field and the contract volume field will be empty for quote records.

2.5.5 Bid and Ask Prices

For quote records, the bid price is recorded in the field [29-34] and the ask price is recorded in the field [35-40]. As for trade prices, bid and ask prices are recorded as a three-digit integer followed by a three-digit fraction, denominated in thirty-seconds of a dollar. The first record in the sample above has a bid price of $4 (004000) and ask price of $4 1/8 (004004). The last record has a bid of $7 1/4 (007008) and an ask of $7 5/8 (007020). The bid and ask fields will be empty for trade records.
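The six-character price fields (trade, bid, and ask) can be converted to exact dollar amounts as follows. This is a minimal sketch; the function name price_from_32nds is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def price_from_32nds(field: str) -> Fraction:
    """Convert an MDR price field (three integer digits followed by three
    digits of thirty-seconds) to an exact dollar amount."""
    return int(field[:3]) + Fraction(int(field[3:]), 32)


# Bid and ask from the last sample record above.
bid = price_from_32nds("007008")
ask = price_from_32nds("007020")
print(bid, ask, ask - bid)
```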

2.5.6 Underlying Asset Price

The underlying asset price is recorded at locations [41-46]. For equity options, the integer portion of the stock price is contained in the first five bytes, and the sixth byte contains the fractional portion of the price, denominated in eighths of a dollar. Thus, the first record in the sample data above was recorded when the current stock price for Gap Stores was $33 1/4 (000332), and the second was recorded when the underlying was at $33 1/8 (000331). For index options, the underlying value is usually recorded with the hundreds digit left-justified in this field, so that an OEX value of 455.31 would be recorded as 455310. The MDR database is not always consistent in the way index options are recorded.
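For equity options, the underlying price field can be decoded in the same spirit. The sketch below handles only the equity layout (five digits of dollars, one digit of eighths); index values are deliberately not handled, since their encoding is described above as inconsistent. The function name is hypothetical.

```python
from fractions import Fraction


def equity_underlying_price(field: str) -> Fraction:
    """Decode an MDR underlying price field [41-46] for an equity option."""
    dollars = int(field[:5])
    eighths = int(field[5])
    return dollars + Fraction(eighths, 8)


print(equity_underlying_price("000332"))  # first sample record
print(equity_underlying_price("000331"))  # second sample record
```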

2.5.7 Option Type Identifier

In byte [48] of the MDR record is a number that identifies the type of underlying security, according to the following table:

 

Number Security Type
1 EQTY
2 GNMA
3 TBND
4 EQTY Group
5 EQTY Index
6 INDEX
7 FCO

 

 

 

3 The Berkeley Options Data Base

The Berkeley Options Data Base (BODB) is associated with the Institute of Business and Economic Research at the University of California, Berkeley. Through a contractual arrangement with the CBOE, BODB offers a reprocessed version of the Market Data Retrieval (MDR) file, beginning August 23, 1976. The database is updated annually, and is currently available through December 31, 1994. The database is managed by a graduate student in the finance group at the Haas School of Business, under the direction of Professor Mark Rubinstein. At the time of this printing (August 17, 1998), the current database manager is Kehong Wen. You may contact BODB via e-mail to options@haas.berkeley.edu, by phone at (510) 643-8893, by fax at (510) 642-5018, or by mail to the Berkeley Options Data Base, Institute of Business and Economic Research MC #1922, Berkeley, CA 94720-1922.

3.1 Availability

The Berkeley Options Data Base is available on 8mm magnetic tapes, sometimes known as ``Exabyte'' tapes. For data prior to 1990, one year of data are stored on a single 8mm tape. Beginning with 1990, six months of data are stored on a single tape. To access the data, it is necessary to use the unix ``tar'' (tape archive) utility. In addition, at least 200 megabytes of hard disk space are required. Normally, the tapes are written using an Exabyte 8500 drive, and cannot be read by the (lower-density) Exabyte 8200 drive. However, the tapes may be written at low-density through special arrangement with the database manager.

Due to the increasing volume of data, data beginning in January 1994 will be compressed using the gzip compression facility.

Plans are underway to also offer the database on 4mm ``dat'' tapes and eventually, CD ROM. Through special arrangement, the data may also be purchased through the internet via ftp. The database is no longer available on 6250 bpi 9-track tapes.

The Berkeley Options Data Base may be purchased at the rate of $200 per month of historical data, with a minimum purchase of six months. In addition, there is a processing charge of $80 per 8mm tape.

Customers in California must also pay sales tax. Academic institutions may acquire the data at a special rate of $150 per month. Academic institutions qualify for a volume discount rate of $120 per month if they have cumulatively purchased 36 months of data. BODB does not currently offer subsets of the database. If you want more than one copy of the tape, for example if one of your tapes is lost or damaged, there is a processing charge of $80 per tape.

To purchase the data, it is necessary to sign a ``subscription agreement'' contract. If your institution already has a contract on file at the BODB from a previous purchase, it is not necessary to sign another contract. To obtain a contract and order form, contact the database manager.

3.2 Storage Format

The BODB is stored in fixed-length ASCII files, archived in unix tarfiles on 8mm tapes. To restore a file onto hard disk on a unix system where the 8mm tape drive is designated /dev/rst5, place the tape in the drive, make sure there is more than 161 MB of disk space in the current partition, and issue the command

tar xvf /dev/rst5 filename

where filename is the name of the file you wish to restore. To list all the files on a tape, issue the command

tar tvf /dev/rst5

This might take as long as two hours. If the filenames end with a .gz suffix, they have been compressed, and must be uncompressed using the publicly available program gunzip.
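As an alternative to running gunzip first, a compressed file can be read directly with Python's standard gzip module. The sketch below is illustrative: the function name read_records and the demonstration filename are hypothetical, and it assumes that each record is stored as one newline-terminated line.

```python
import gzip
import os
import tempfile


def read_records(path: str):
    """Yield records from a BODB file, whether gzip-compressed (.gz) or not.

    Assumption: one record per newline-terminated line.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="ascii") as handle:
        for line in handle:
            yield line.rstrip("\n")


# Self-contained demonstration using a small temporary file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.gz")
    with gzip.open(demo_path, "wt", encoding="ascii") as out:
        out.write("43IBM930104084001 1-08500033500337505140\n")
    records = list(read_records(demo_path))
print(records)
```

Reading the file as a stream in this way avoids holding an uncompressed copy on disk.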

Prior to December, 1993, the filenames on the tape are of the form resXXX, and are the same as the names of the 6250 bpi tapes on which the data were previously stored. The first file (from 1976) is res01, the December, 1993 file is res227, and the file numbers in between are consecutive. The only anomalies are the files for August, September, and October, 1987 (the first three months that originally required more than one 6250 bpi tape), which are named res98A and res98B, res99A and res99B, res100A and res100B. For a complete listing of filenames, see Appendix B. Beginning in January, 1994, filenames are of the form resYYXXX, so the first file in 1994 is res94001, and the last is res94046.

 

3.3 Sorting Order

The Berkeley Options Data Base is currently available only in the ``resorted'' format. It is called the resorted format because it is little more than a resorted version of the MDR. The processing program alters certain fields of the MDR records to make them easier to interpret, performs a few screens for bad or duplicate records, and changes the sorting order. While the MDR is sorted according to five-letter ticker symbol, BODB is sorted according to three-letter ticker symbol. This means that in the BODB, all the day's records on the same underlying stock are ordered chronologically, regardless of expiration or strike, unlike the MDR, where records are further sorted by option contract.

In summary, the BODB is sorted according to the scheme

DATE : UNDERLYING ASSET : TIME

Beginning in January, 1994, records occurring in the same second are further sorted according to record type and option contract. The new sorting scheme is:

DATE : UNDERLYING ASSET : TIME : RECORD TYPE : EXPIRATION : PUT/CALL : STRIKE
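The basic change of sorting order can be pictured with a small sketch. It is not the BODB processing program: it only re-sorts MDR-format records by date, three-letter ticker, and time, and it does not implement the additional same-second keys introduced in 1994. The function name is hypothetical.

```python
def bodb_sort_key(mdr_record: str) -> tuple:
    """Key for the DATE : UNDERLYING ASSET : TIME ordering."""
    date = mdr_record[0:6]       # [1-6]
    underlying = mdr_record[6:9] # first three ticker characters [7-9]
    time = mdr_record[11:17]     # [12-17]
    return (date, underlying, time)


# First 17 characters of three sample MDR records, in MDR order.
mdr_order = ["920922GPSOG144049", "920922GPSOH083237", "920922GPSOH083602"]
bodb_order = sorted(mdr_order, key=bodb_sort_key)
print(bodb_order)
```

Because Python's sort is stable, records with identical keys keep their original MDR order.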

3.4 Record Layout

The resorted data are contained in a fixed-length file with the following

(undelimited) fields:

 

Description Field Type Length
Record Type Numeric 2
Ticker Symbol Alpha 3
Date Numeric 6
Time Numeric 6
Expiration Month Numeric 2
Strike Price Numeric 6
Bid Price or Trade Price Numeric 5
Ask Price or Volume Numeric 5
Underlying Asset Price Numeric 5

 

 

Here are a few sample records from the Berkeley Options Data Base:

43IBM930104084001 1-08500033500337505140
 2IBM930104084014 2 04500007500080005140
 1IBM930104084021 1-05500003880000305140
 2IBM930104084021 2-04500001250013105140
 2IBM930104084034 2 05000004000042505140
 1IBM930104084038 2 05000004000000405140
 2IBM930104084040 2-05000002880031305140
 1IBM930104084044 1 05500000630004105140
 2IBM930104084052 2 05500001750018105140
 1IBM930104084053 1 05500000630001005140

 

 

3.4.1 Record Type

The first field [1-2] determines whether the record is a trade or a quote, and also incorporates the information in the MDR prefix code. The MDR data type and prefix codes are translated into BODB record types according to the following table:

 

MDR Code   BODB Record Type
04HALT      0
01          1
02          2
01SPRD      3
01STDL      4
02HALT      5
01LATE      6
01OSEQ      7
01OPEN      8
01OPNL      9
03LATE     20
03OSEQ     21
03OPEN     22
03SPRD     23
03STDL     24
03CNCO     25
03CNCL     26
03CANC     27
03CNOL     28
03OPNL     29
04AUTO     40
01RAES     41
04END      42
02ROTA     43
04ROTA     44
04ENDR     45
02AUTO     46
04END      47
04ENDF     48
04FAST     60
01FAST     61
02FAST     62
01CLOS     63
02CLOS     64
04CLOS     65
03REOP     66
02ZZZZ     67

 

 

 

3.4.2 Ticker Symbol

In the Berkeley Options Data Base, the underlying ticker symbol, which is located in field [3-5], is copied directly from the first three characters of the MDR ticker symbol. As mentioned above, this ticker usually corresponds to the stock exchange ticker symbol for NYSE and AMEX stocks, but not for NASDAQ stocks. More details on ticker symbols are given in the ``Research Guide'' section of this document, and a list of CBOE ticker symbols is contained in Appendix C.

3.4.3 Date and Time

The BODB date and time fields are copied exactly from the corresponding fields in the MDR. Thus, the date and time, recorded in characters [6-17], are given in the form YYMMDDHHMMSS.

3.4.4 Expiration Month

Instead of using the CBOE's alphabetic expiration codes, the Berkeley Options Data Base reports the expiration month in numeric form in characters [18-19].

The records in the above sample are all for options expiring in January ([space]1) or February ([space]2). Details on how to determine exact expiration dates are contained in the ``Research Guide'' section below.

3.4.5 Strike Price and Call/Put Indicator

The BODB contains the strike price, denominated in cents, in location [21-25]. Puts are indicated by a negative sign in location [20]. Thus, the first record above is for a (January) 85 put, and the last record is for a (January) 55 call.
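A brief sketch of reading the call/put indicator and strike price from a BODB record follows. The function name is hypothetical; the sample is the first record shown above, which is exactly 40 characters wide.

```python
def strike_and_type(record: str) -> tuple:
    """Return ("P" or "C", strike in dollars) from a 40-character BODB record."""
    sign = record[19]                   # position [20]: "-" marks a put
    strike_cents = int(record[20:25])   # positions [21-25], in cents
    return ("P" if sign == "-" else "C"), strike_cents / 100


first = "43IBM930104084001 1-08500033500337505140"
print(strike_and_type(first))
```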

3.4.6 Trade Prices, Contract Volume, Bid and Ask Prices

Because the MDR trade fields are always empty for quote records, and its quote fields are empty for trade records, the BODB is able to save space by recording both fields in the same location. For trade records, the price is recorded at [26-30] and contract volume is recorded at [31-35]. For quote records, the bid price is recorded at [26-30] and the ask price is recorded at [31-35].

Instead of recording prices in thirty-seconds of a dollar, the BODB converts the fractional portion of the MDR bid, ask, and transaction prices to pennies, rounding off according to the following rule:

Thirty-Seconds  Cents    Thirty-Seconds  Cents    Thirty-Seconds  Cents
001             03       012             38       023             72
002             06       013             41       024             75
003             09       014             44       025             78
004             13       015             47       026             81
005             16       016             50       027             84
006             19       017             53       028             88
007             22       018             56       029             91
008             25       019             59       030             94
009             28       020             63       031             97
010             31       021             66       000             00
011             34       022             69

 

 

 

The second record in the above sample is a quote record on a February 45 call, with bid price of $7 1/2 (00750) and ask price of $8 (00800). The third is a trade record on a January 55 put, where at a price of 3 7/8 (00388), three contracts (00003) were traded.
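The rounding table above is consistent with rounding each fraction to the nearest cent, with exact halves rounded up. The following sketch reproduces the table under that assumption; it is not the BODB processing program itself, and both function names are hypothetical.

```python
def cents_from_32nds(n: int) -> int:
    """Round n thirty-seconds of a dollar to whole cents, halves rounding up."""
    return (n * 100 + 16) // 32


def bodb_price_field(mdr_field: str) -> str:
    """Convert a six-character MDR price field (dollars + thirty-seconds)
    to a five-character price in cents, zero-padded."""
    dollars = int(mdr_field[:3])
    cents = cents_from_32nds(int(mdr_field[3:]))
    return f"{dollars * 100 + cents:05d}"


print([cents_from_32nds(n) for n in (4, 12, 20, 28)])  # the four half-cent cases
print(bodb_price_field("007016"), bodb_price_field("003028"))
```

Integer arithmetic is used so that the half-cent cases (4, 12, 20, and 28 thirty-seconds) are not affected by floating-point rounding.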

3.4.7 Underlying Asset Price

For most ticker symbols, the Berkeley Options Data Base contains an exact copy of the last four digits of the MDR's underlying price field, plus an additional zero as a place-holder. Thus, the price of IBM in the above records was $51 1/2 (05140). You may notice that some of the previous BODB documentation describes this field as being recorded in dollars and cents. This is no longer correct--beginning with the January, 1986 data, the field is recorded in dollars and eighths for equity options.

For index options such as the OEX and SPX series, the BODB contains an exact copy of the first five digits of the MDR's underlying price field, so that an OEX value of 455.31 would be recorded as 45531. Not all underlying values for index options are recorded correctly in release 2.01 of BODB (January 1986--December 1993). In some cases, such as the SPZ overflow ticker, the version 2.01 processing program mistakenly treated the MDR record as an equity option. Consequently, the BODB contains only the last three digits of the underlying price. An SPZ value of 415.65, for example, was mistakenly recorded as 56500. This problem has been corrected in the new version 3.0 program, so data beginning in January, 1994 are fine.
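Putting the record layout of this section together, a BODB record might be split into its fields as sketched below. All names are hypothetical. The two shared fields are left as raw integers because their meaning depends on the record type, and the underlying price is decoded only for the equity case (dollars and eighths, January 1986 onward).

```python
from fractions import Fraction
from typing import NamedTuple


class BodbRecord(NamedTuple):
    record_type: int       # [1-2]
    ticker: str            # [3-5]
    date: str              # [6-11], YYMMDD
    time: str              # [12-17], HHMMSS
    expiration_month: int  # [18-19]
    is_put: bool           # "-" in [20]
    strike_cents: int      # [21-25]
    field_a: int           # [26-30]: trade price or bid price, in cents
    field_b: int           # [31-35]: volume (trades) or ask price in cents (quotes)
    underlying: str        # [36-40], left undecoded


def parse_bodb(line: str) -> BodbRecord:
    if len(line) != 40:
        raise ValueError("expected a 40-character record")
    return BodbRecord(
        record_type=int(line[0:2]),
        ticker=line[2:5].strip(),
        date=line[5:11],
        time=line[11:17],
        expiration_month=int(line[17:19]),
        is_put=line[19] == "-",
        strike_cents=int(line[20:25]),
        field_a=int(line[25:30]),
        field_b=int(line[30:35]),
        underlying=line[35:40],
    )


def equity_underlying(field: str) -> Fraction:
    """Equity case only: three digits of dollars, one digit of eighths,
    and a trailing place-holder zero."""
    return int(field[0:3]) + Fraction(int(field[3]), 8)


rec = parse_bodb("43IBM930104084001 1-08500033500337505140")
print(rec)
print(equity_underlying(rec.underlying))
```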

 

4 Historical Information

A brief account of the origin of the Berkeley Options Data Base is given in a paper by Rubinstein and Vijh (see the Bibliography, below). The Berkeley Options Data Base was created by Mark Rubinstein and others at the University of California, Berkeley, in cooperation with the Chicago Board Options Exchange, and with a grant from the National Science Foundation. Several other individuals have helped create, develop, or maintain BODB over the years, including Anand Vijh, Mihir Bhattacharaya, Mark Garman, Robert Geske, Rachid Laraqui, Frederic Sipiere, Richard Lindsay, Gail Belonsky, Rakesh Chandra, Stewart Mayhew, Kehong Wen, and Xiaoyan Ma. The Berkeley Options Data Base operates under the auspices of the Institute of Business and Economic Research at the University of California, Berkeley.

4.1 Versions

As changes occur in the MDR tapes, or as errors are discovered in the data, new versions of the BODB are sometimes released. BODB does not guarantee the accuracy of the data, and has no responsibility to replace old data when a new version is released. However, under our current policy we will replace data for anyone who has purchased our data, at any time, and for any reason. The replacement charge is $80 per 8mm tape. Unfortunately, we can no longer send replacement data on 9-track tapes.

Version numbers for the BODB correspond to the processing program used to convert MDR data to BODB format. It is costly and difficult for us to go back and process old MDR data with a new processing program. Consequently, different periods of the database were processed using different programs, and there are slight format differences. The current processing program, version 3.0, is written in PERL and runs under a SunOS implementation of unix system V release 4. Currently, the version 3.0 release begins in January, 1994. Unlike previous versions, version 3 does not alter record types to indicate an underlying stock split.

Moving to the unix platform involved a lot of new programming, and we are still debugging the processing program. We are aware of one error in version 3.0: there are a few extra unidentified control records in the MDR data, which should have been discarded but were included in the BODB with no record type or with type 4. These records contain no useful information and should be disregarded. If you find any other problems with the data, please contact the database manager.

The processing program for Version 2.1, which is the current release from January 1986 through December 1993, was written in FORTRAN and REXX, and operated under VM/CMS. This and previous versions altered the record type for options whose specifications had been modified because of an underlying stock split. The procedure for effecting this modification was imperfect, however, and not all splits were correctly specified. Rather than continuing to provide faulty split information, we elected to discontinue this practice. To obtain correct information on stock splits, you must contact the exchange and obtain the memo corresponding to the split. For more information, see the section on stock splits, below. Another problem with version 2.1 was its incorrect treatment of underlying prices for index options, described in the ``underlying asset'' section above. Plans are underway to reprocess this data under version 3.

 

Version 2.0, which covers December 1979 through December 1985, differs from version 2.1 in that (1) the unused portion of each numeric field is filled with blank spaces instead of zeros, and (2) underlying stock prices are recorded in dollars and cents instead of dollars and eighths. Starting September 30, 1985, the market began opening at 8:30 instead of 9:00. In version 2.0, time stamps for the first half hour of trading were incorrect: transactions occurring X seconds before 9:00 A.M. were mistakenly recorded as having occurred X seconds after 9:00 A.M. This problem exists only from September 30, 1985 through December 31, 1985.

Version 1, which covers August 1976 through November 1979, has a slightly different format, with one fewer character in the strike price and underlying asset fields.

4.2 The Consolidated Format

Until 1987, BODB offered a condensed version of the MDR file, called the consolidated format. Instead of reporting each individual record, this format summarized all trades and quotes occurring within each block of time during which the underlying asset price did not change. Each record contained the date, ticker, strike price, and time to expiration, a beginning and ending time, the underlying stock price, high and low quoted prices, a summary of all trades during the period, information on the stock prices preceding and following the record and the approximate elapsed time between these price changes, and the number of original transaction records from which the consolidated record was created. The consolidated format was discontinued due to lack of interest.

All the information in the consolidated format may be derived from the resorted format.

4.3 The Supplemental Tape

BODB formerly offered a ``supplemental'' tape which included several daily interest rate series, dividend information on all CBOE underlying stocks, and daily closing levels of stock market indexes. This service was discontinued due to high maintenance costs. Dividend information is available from CRSP. Interest rates may be obtained from the Wall Street Journal, or else implied interest rates may be calculated from the prices of S&P 500 index options using the put-call parity relationship. Daily index information may be purchased from the CBOE.
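The put-call parity calculation mentioned above can be sketched in C. This is only an illustration under stated assumptions, not a procedure taken from the BODB itself: the options are European (as SPX options are), the call and put share the same strike and expiration and are observed at the same moment, the present value of dividends paid before expiration is known, and the rate is quoted as a simple (non-compounded) annual rate, so that parity reads C - P = S - D - K/(1 + rT). The function name implied_rate and its argument names are hypothetical.

```c
/*
 * Implied simple annual interest rate from European put-call parity,
 *     C - P = S - D - K / (1 + r * T),
 * where C and P are the call and put prices, S the underlying level,
 * D the present value of dividends before expiration (an assumed input),
 * K the common strike, and T the time to expiration in years.
 *
 * Stores the rate in *rate and returns 0 on success. Returns -1 when the
 * inputs imply a non-positive present value for the strike, or when T is
 * not positive, in which case no rate can be backed out.
 */
int implied_rate(double call, double put, double spot, double pv_div,
                 double strike, double years, double *rate)
{
    /* By parity, this quantity equals K / (1 + r * T). */
    double pv_strike = spot - pv_div - call + put;

    if (pv_strike <= 0.0 || years <= 0.0)
        return -1;

    *rate = (strike / pv_strike - 1.0) / years;
    return 0;
}
```

Because the result is very sensitive to the underlying level, the call, put, and underlying prices fed to such a function should be matched as closely in time as the data allow.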

 

 

5 Research Guide

5.1 Why Use Transactions Data?

Transactions data is ideal for empirically testing market-microstructure models, and extremely useful for any type of research investigating bid-ask spreads, order flow, trading volume, price discovery and the lead-lag relationship between options and their underlying stocks, price discreteness, or intraday dynamics. It is also very useful in more traditional asset pricing, in that it can be used to measure the biases introduced by microstructural frictions.

In options markets, it is particularly important to recognize the severe problems associated with asynchronous data. Since option prices are so sensitive to the underlying asset price, unobserved intraday movements in the underlying price will render asynchronously recorded closing prices incomparable. In fact, it is not uncommon to observe apparent static arbitrage opportunities among reported closing prices. Results of studies based on closing option prices without correcting for changes in the contemporaneous stock price are often viewed as highly suspect, and are unlikely to be accepted for publication.

Thus, if your research calls for daily observations of option prices, it is strongly recommended that you carefully construct your daily observations from transactions data.

5.2 Ticker Symbols

Appendix C of this document contains information to help match options to their underlying security. However, the list is constantly changing, and you may wish to update it yourself. This section describes how to obtain information on ticker symbols, and describes conventions used by the exchanges in choosing ticker symbols.

5.2.1 Obtaining Ticker Symbol Information

The best way to identify option ticker symbols is to contact the Options Clearing Corporation and obtain their pamphlet ``Directory of Exchange Listed Options.'' This pamphlet contains the ticker symbols for options traded on the CBOE, as well as those trading on the New York, American, Philadelphia, and Pacific stock exchanges. The pamphlet may also be obtained through the CBOE, and is also available, for a fee, in electronic form.

One drawback is that this pamphlet contains only currently traded options. For historical research, there is another list, available from the CBOE, that contains listing and delisting dates of equity options, along with some name change information that can be helpful for identifying tickers. Appendix C contains a subset of this list, containing only ticker symbols, listing and delisting dates, and company names for options trading on the CBOE. Appendix C also contains lists of ticker symbols for other options trading on the CBOE, including index options, index LEAPS, interest rate options, and equity LEAPS.

5.2.2 Ticker Symbol Conventions

The New York and American Stock Exchanges use three-letter ticker symbols to identify securities. Options on stocks traded on these exchanges generally use the same three-letter symbol. NASDAQ ticker symbols are longer than three letters, and for these stocks, options are traded using a ticker symbol consisting of two letters from the NASDAQ ticker plus the letter Q.

Index options are normally given a three-letter ticker symbol ending in X, with the first two letters chosen to describe the underlying security. For instance, the ticker symbol for the S&P 500 index is SPX, and the ticker symbol for the S&P 100 index, originally known as the CBOE 100 index, is OEX.

Because options are traded using a single letter to designate the option's strike price, there is a limit to the number of different strike prices that may be assigned to a single ticker symbol. For the most popular option contracts, such as the SPX and OEX indexes, it became necessary to assign a second ticker symbol to accommodate the wide range of strike prices. Secondary ticker symbols are usually created by changing the last letter of the ticker symbol to Z. Ticker SPZ serves as the overflow symbol for SPX, and OEZ is the overflow for OEX. Sometimes, additional option classes are added that are similar to an existing class, and these are assigned ticker symbols that are as descriptive as possible. For example, options on the SPX index that expire at the end of each quarter rather than the customary dates were given the ticker symbol SPQ.

Options with long-term expiration dates, known as LEAPS, are also assigned their own ticker symbols, which resemble the original tickers. LEAPS are traded with one expiration date per year, either in December or January depending on whether they are Equity or Index LEAPS. Generally a letter is chosen to designate all LEAPS expiring in the same year, such as ``V'' for 1995 or ``L'' for 1996, and each LEAPS ticker symbol is created, if possible, by substituting this letter for one of the letters in the original ticker. For example, the IBM 1995 LEAPS are traded under the ticker symbol VIB, and IBM 1996 LEAPS under the symbol LIB. Sometimes the letters have to be rearranged to avoid duplicating another ticker symbol. Ticker symbols for some Index and Equity LEAPS are listed in Appendix C. You may wish to contact the exchange for a more current list.

 

5.3 Stock Splits

Another reason for introducing new ticker symbols is to differentiate between options on pre- and post-split shares. Because exchange-traded options are protected against splits, the terms of the existing option contracts must be adjusted whenever a stock splits. A call option contract with a strike price of 80 gives its owner the right to buy 100 shares of the underlying stock for 80 dollars a share. If the stock then splits 2-for-1, the contract is adjusted so that the owner now holds a call on 200 shares with a strike price of 40. After the split, however, new options will also be written with a strike price of 40, but the newly written options are on 100 shares. Under the standard ticker-symbol nomenclature, traders would be unable to distinguish between options written on 200 shares and those written on 100 shares. To solve this problem, the exchange introduces a secondary ticker symbol whenever there is a stock split. The split-adjusted options are assigned a new ticker symbol, usually constructed from two letters of the original ticker symbol plus the letter Z. If IBM were to split, the old options might begin trading under the ticker symbol IBZ, while newly written options would trade under the usual ticker symbol. In addition, traders must remember that the strike prices on the old options will be spaced at unconventional intervals. After a 2-for-1 split, for example, old options whose strikes were five dollars apart will have strikes two and a half dollars apart. To avoid confusion, the exchange prepares a special memorandum whenever a stock splits, specifying the exact 5-letter trading code for each old option and each new option.
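The arithmetic of the 2-for-1 example can be sketched in C. This is a minimal illustration only: the structure contract_terms and the function adjust_for_split are hypothetical names, and the sketch assumes a simple n-for-d split in which the share count stays a whole number. The exchange memorandum remains the authoritative source for the actual adjusted terms.

```c
/* Terms of one option contract. */
struct contract_terms {
    double strike;  /* strike price, in dollars per share */
    long   shares;  /* number of shares deliverable per contract */
};

/*
 * Adjust an existing contract for an n-for-d stock split
 * (n = 2, d = 1 for a 2-for-1 split).
 *
 * The strike is scaled down and the share count scaled up by the split
 * ratio, so the total exercise cost (strike * shares) is unchanged.
 * Assumes old.shares * n is evenly divisible by d.
 */
struct contract_terms adjust_for_split(struct contract_terms old,
                                       long n, long d)
{
    struct contract_terms adj;

    adj.strike = old.strike * (double)d / (double)n;
    adj.shares = old.shares * n / d;
    return adj;
}
```

Applied to the example in the text, a contract on 100 shares with a strike of 80 becomes a contract on 200 shares with a strike of 40, while a newly written contract at the same strike would still cover 100 shares.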

5.4 Option Expiration

Nearly all options expire on the Saturday following the third Friday of the expiration month. One exception is the S&P 500 end-of-quarter options (SPQ). Equity options are listed for the nearest two expiration dates, plus the next two expiration dates in the option's expiration ``cycle.''

 

Cycle       Expiration Months
January     JAN, APR, JUL, OCT
February    FEB, MAY, AUG, NOV
March       MAR, JUN, SEP, DEC

 

For example, at the beginning of April, a January-cycle stock will have options expiring in April, May, July, and October, a February-cycle stock will have options expiring in April, May, August, and November, and a March-cycle stock will have options expiring in April, May, June, and September. On the Saturday following the third Friday of April, the April options expire, and a new set of contracts is introduced (June contracts are opened for January-cycle and February-cycle stocks, and December contracts are opened for March-cycle stocks).
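The listing rule just described (the two nearest months, plus the next two months in the stock's cycle) can be written as a short C function. This is a sketch of the rule as stated in this section, not exchange code; the function name listed_months and its argument conventions are hypothetical.

```c
/*
 * Fill months[0..3] with the expiration months listed for an equity
 * option (1 = January, ..., 12 = December).
 *
 *   front  the nearest expiration month that has not yet expired
 *   cycle  1 = January cycle, 2 = February cycle, 3 = March cycle
 *
 * Rule from the text: the two nearest months, followed by the next two
 * months that belong to the stock's cycle.
 */
void listed_months(int front, int cycle, int months[4])
{
    int m;
    int n = 2;

    months[0] = front;
    months[1] = front % 12 + 1;

    /* Walk forward month by month, keeping only months on the cycle. */
    for (m = months[1] % 12 + 1; n < 4; m = m % 12 + 1) {
        if ((m - cycle) % 3 == 0)
            months[n++] = m;
    }
}
```

With front = 4 (April) this gives 4, 5, 7, 10 for the January cycle, 4, 5, 8, 11 for the February cycle, and 4, 5, 6, 9 for the March cycle, matching the example above. With front = 5, after the April expiration, June appears for the January and February cycles and December for the March cycle.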

Some index options, such as SPX, are assigned to expiration cycles like equity options. Others, such as the OEX, are always traded for the closest four expiration dates. In addition, long-term options with maturities of up to three years are traded on popular securities. A list of index and equity LEAPS is contained in Appendix C. These options expire once a year, in December or January.
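For options that follow the standard rule given at the start of this section (the Saturday following the third Friday of the expiration month), the expiration day can be computed directly. The sketch below uses Sakamoto's day-of-week method for the Gregorian calendar; the function names are hypothetical, and the code does not apply to exceptions such as the end-of-quarter SPQ options.

```c
/*
 * Day of the week for a Gregorian date, using Sakamoto's method.
 * Returns 0 = Sunday, 1 = Monday, ..., 6 = Saturday.
 */
static int day_of_week(int year, int month, int day)
{
    static const int t[12] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};

    if (month < 3)
        year -= 1;
    return (year + year / 4 - year / 100 + year / 400
            + t[month - 1] + day) % 7;
}

/*
 * Day of the month on which standard options expire: the Saturday
 * following the third Friday of the given month.
 */
int expiration_day(int year, int month)
{
    int first_dow = day_of_week(year, month, 1);
    int first_friday = 1 + (5 - first_dow + 7) % 7;  /* 5 = Friday */

    return first_friday + 14 + 1;  /* third Friday, plus one day */
}
```

Since the third Friday never falls later than the 21st, the following Saturday is always in the same calendar month, so no month rollover is needed.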

5.5 Other Data Sources

Organized exchanges are usually required to keep a permanent record of all their transactions. Many of these exchanges have made their transactions data available to the public, or at least to academics for research purposes. In general, the only way to acquire this data is directly from the exchange, usually on 9-track tapes.

 

 

5.5.1 Stock Market Transactions Data

Prior to January, 1993, transactions data for stocks traded in the United States were available on 9-track magnetic tapes from the Institute for the Study of Securities Markets (ISSM). Beginning in January 1993, the New York Stock Exchange instituted the ``Trades And Quotes'' (TAQ) database, containing trades and quotes for all stocks on the New York and American Stock Exchanges, as well as the Regional exchanges and NASDAQ. The TAQ database comes on PC-readable CD-ROM, with built-in access programs that run under DOS. The FORTRAN source code is included for these access programs. The data may be purchased from the New York Stock Exchange for $200/month, with each month stored on a separate CD.

In addition to the TAQ, the New York Stock Exchange has more extensive databases which are not currently available to the public, but which are often used by researchers at the exchange. A small sample of this data, known as TORQ for ``Trades, Orders, Reports, and Quotes,'' was publicly released on a single CD-ROM. This CD contains a few months of data for a small number of firms. In addition to the trade and quote data available on TAQ, this database contains the so-called ``audit trail'' data, a rich source of information about the institutional structure of order flow in the stock market.

5.5.2 Options on Futures

Options on futures are traded primarily at the Chicago Mercantile Exchange and the Chicago Board of Trade. Transactions data may be obtained from these exchanges on 9-track tapes. Especially of interest are their options on S&P index futures, bond futures, Eurodollar futures, and currency futures.

5.5.3 Currency Options

The Berkeley Options Data Base contains some data on currency options, which were traded on the CBOE during part of 1985 and 1986. European-style options were traded on the Japanese Yen, Deutsche Mark, British Pound, Swiss Franc, and Canadian Dollar. Due to the low trading volume on these contracts, the CBOE sold them to the Philadelphia Stock Exchange (PHLX), where they are still traded.

American-style currency options are also traded at the PHLX. For a charge of $75, academic researchers may purchase either a daily summary database on 30 floppy disks or a transactions database on one 9-track magnetic tape. The transactions database begins in 1984, and contains only trades, not quotes.

The PHLX stores its quote data on microfiche, and it is inaccessible for technical reasons.

5.5.4 How to Contact the Exchanges

American Stock Exchange
Derivative Securities
86 Trinity Place
New York, NY 10006
1-800-THE-AMEX

Chicago Board Options Exchange
LaSalle at Van Buren
Chicago, IL 60605
1-800-OPTIONS
(312) 786-5600

New York Stock Exchange
Options and Index Products
11 Wall Street
New York, NY 10005
1-800-692-6973
(212) 656-8533

The Options Clearing Corporation
440 South LaSalle Street
Suite 2400
Chicago, IL 60605
1-800-537-4258
(312) 322-6200

Pacific Stock Exchange
Options Marketing
115 Sansome Street, 7th Floor
San Francisco, CA 94104
1-800-TALK-PSE
(415) 393-4028

Philadelphia Stock Exchange
1900 Market Street
Philadelphia, PA 19103
1-800-THE-PHLX
(215) 496-5404

Chicago Board of Trade
Market Data Services [Larsenia R. Williams]
141 W. Jackson, Ste. 2313
Chicago, IL 60604-2994
(312) 341-3163

Chicago Mercantile Exchange
Records Management [André Gibson]
30 South Wacker Drive
Chicago, IL 60606-7499
(312) 930-3178

 

 

6 Computer Programs

6.1 Extracting a List of Ticker Symbols

If you are using a unix system, it is very easy to extract a list of ticker symbols contained in a BODB file. To create a new file ticklist from BODB file res227, containing an alphabetical list of all ticker symbols in that file, simply type the following at the unix command line:

cut -c 3-5 res227 | sort -u -o ticklist &

This command ``cuts'' out columns 3 through 5 from the file res227, and sends the result to a sorting program. The -u option instructs the sorting program to throw away duplicate observations, and -o ticklist specifies the name of the output file. The ampersand makes the command run in the background. You could use the same type of command to extract a list of record types [1-2], expiration dates [18-19], strike prices [20-25], or any other field in the data.

6.2 Extracting Columns

Suppose you are not interested in all the data fields, but only wish to extract a few fields. You can accomplish this from the unix command line using the cut and paste utilities. For example, suppose that you only want the ticker symbol, date, time, and stock price. Because the ticker, date, and time are adjacent, you can cut them out and place them in a separate file with a single command,

cut -c 3-17 res227 > firsthalf

then cut out the stock prices with a similar command,

cut -c 35-40 res227 > secondhalf

and glue them back together in a single file:

paste firsthalf secondhalf > outfile
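If you are already processing records inside a C program, the same column selection can be done without cut and paste. The following is a minimal sketch; the function name extract_field is hypothetical, and the column numbers are counted from 1 exactly as they are for cut -c.

```c
#include <string.h>

/*
 * Copy columns first..last (1-based, inclusive, as with cut -c) of a
 * fixed-width record into out. The buffer out must have room for
 * last - first + 2 bytes (the field plus a terminating null).
 *
 * Returns 0 on success, or -1 if the range is invalid or the record is
 * too short to contain the requested columns.
 */
int extract_field(const char *record, int first, int last, char *out)
{
    size_t len = strlen(record);

    if (first < 1 || last < first || (size_t)last > len)
        return -1;

    memcpy(out, record + first - 1, (size_t)(last - first + 1));
    out[last - first + 1] = '\0';
    return 0;
}
```

For the example above, calling extract_field(buf, 3, 17, key) and extract_field(buf, 35, 40, price) on each line read from res227 yields the same two pieces that the cut commands write to firsthalf and secondhalf.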

 

6.3 Extracting Records

The simplest way to extract records from a BODB file is using a computer program that reads the file one line at a time, checks to see whether the line meets the selection criteria, and if so, writes the line to an output file. Simple jobs, such as extracting all records for a single ticker symbol, can easily be handled using built-in unix utilities such as grep. For example, to extract all OEX records from BODB file res193 and place them in a new file called oex193, simply type

grep OEX res193 > oex193

at the unix command line. When extracting one- or two-letter ticker symbols using grep, you may accidentally extract unwanted symbols. For example, if you were to try grep T res193, you would extract not only ticker symbol T but also every other record whose ticker symbol contains the letter T. To get around this problem, you must tell the grep program to look for the string T preceded by a digit from 0 to 9 and followed by a space. How you specify this may depend on the syntax for regular expressions in your unix shell. In the sh shell, issue the command

grep '[0-9]T ' res193 > t193

The grep command is limited to string comparison, and is adequate only for the simplest extraction problems. For more complicated extraction criteria, you may wish to consider using a unix programming language such as sed, awk, or perl, an SQL or other database package, or a compiled language such as FORTRAN or C.
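In C, the ambiguity that affects grep can be avoided by comparing the ticker field itself rather than searching the whole line. The sketch below assumes that the ticker occupies columns 3 through 5 and that tickers shorter than three letters are left-justified and padded with spaces, which is consistent with the grep pattern above but should be checked against your data. The function name ticker_matches is hypothetical.

```c
#include <string.h>

/*
 * Return 1 if columns 3-5 of the record hold exactly the given ticker
 * symbol, and 0 otherwise.
 *
 * Assumption: tickers of one or two letters are stored left-justified
 * and padded on the right with spaces.
 */
int ticker_matches(const char *record, const char *ticker)
{
    char padded[4] = "   ";        /* three spaces plus the null */
    size_t n = strlen(ticker);

    if (n == 0 || n > 3 || strlen(record) < 5)
        return 0;

    memcpy(padded, ticker, n);     /* overwrite the leading spaces */
    return strncmp(record + 2, padded, 3) == 0;
}
```

A test of this kind can be used as a selection criterion inside a record-by-record extraction loop.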

Following is a template extraction program in C. The program takes, as input from the command line, an input file name, a starting record number, and an ending record number. It reads the designated records from the input file, and outputs them to standard output. You may insert whatever selection criteria you wish, setting the variable keepdummy=1 if you wish to output the record.

To read the entire input file, you may either modify the program or run the program with beginning record number 1 and an ending record number larger than the number of records in the file.

 

/* Extractor
   C Program to Extract Records from the Berkeley Options Data Base
   Copyright (C) Stewart Mayhew, March 1995. */

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

#define RECLEN 41   /* bytes per record, newline included */

/* Main Function Begins Here */
int main(int argc, char **argv)
{
    char buf[256];
    FILE *fp;
    char record[60];
    long count, offsetnum, numrecs, beginp, endp;
    int keepdummy;

    /* Check for proper input */
    if (argc != 4) {
        fprintf(stderr, "usage: %s <input file> <first record> <last record>\n",
                argv[0]);
        exit(1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Can't Find file %s\n", argv[1]);
        exit(1);
    }
    beginp = atol(argv[2]);
    endp = atol(argv[3]);

    /* Records have a fixed length, so seek straight to the first one */
    offsetnum = (beginp - 1) * RECLEN;
    fseek(fp, offsetnum, SEEK_SET);
    numrecs = endp - beginp + 1;

    /* Loop Begins Here */
    for (count = 1; count <= numrecs; count++) {
        if (fgets(buf, 256, fp) == NULL)
            break;              /* end of file reached early */
        keepdummy = 0;

        /* Read Record */
        strncpy(record, buf, RECLEN);
        record[RECLEN] = '\0';

        /* ..........Insert Search Criteria Here...........
           Set keepdummy=1 to output the record */

        /* Write Record to Standard Output */
        if (keepdummy) printf("%s", record);
    } /* End of Loop through records */

    fclose(fp);
    return 0;
}

 

 

6.4 Creating an Index File

Both grep and the Extractor template program will tend to be slow, because they have to read each record sequentially. A single BODB file may contain up to 4 million records, and since you will probably be running the program multiple times on different files, it may take quite a long time to extract all the data you need. In the long run, you can save a lot of time by creating an index that stores information about where the different ticker symbols are located within the file. You will have to read the data sequentially to create the index file, but once created it will dramatically decrease extraction time.

Following is a program that reads a BODB file and writes an index to standard output. The index is simply a list of beginning and ending record numbers for each date/ticker symbol combination.

/* MakeIndex
   C Program to create an index for a BODB file
   Copyright (C) Stewart Mayhew, March 1995 */

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

/* Main Function Begins Here */
int main(int argc, char **argv)
{
    char buf[256];
    FILE *fp;
    char rtype[10];
    char ticker[10];
    char thisticker[10] = "";   /* ticker of the group being counted */
    char datebuf[10];
    long date, thisdate = 0, count, ibegin = 1, iend;

    /* Check for proper input */
    if (argc != 2) {
        fprintf(stderr, "usage: %s <input file>\n", argv[0]);
        exit(1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Can't Find file %s\n", argv[1]);
        exit(1);
    }

    /* Loop Begins Here */
    for (count = 1; ; count++) {
        if (fgets(buf, 256, fp) == NULL)
            break;

        /* Read Record: type [1-2], ticker [3-5], date [6-11] */
        if (sscanf(buf, "%2c%3c%6c", rtype, ticker, datebuf) != 3)
            continue;           /* skip a line too short to parse */
        rtype[2] = '\0';
        ticker[3] = '\0';
        datebuf[6] = '\0';
        date = atol(datebuf);

        /* Check to see if there is a new ticker or date.
           If so, output index info for the group just finished. */
        if ((strcmp(ticker, thisticker) != 0) || (date != thisdate)) {
            iend = count - 1;
            if (iend > 0)
                printf("%ld %s %ld %ld\n", thisdate, thisticker, ibegin, iend);
            ibegin = count;
            strcpy(thisticker, ticker);
            thisdate = date;
        }
    } /* End of Loop through records */

    /* Output index info for the final date/ticker group */
    iend = count - 1;
    if (iend > 0)
        printf("%ld %s %ld %ld\n", thisdate, thisticker, ibegin, iend);

    fclose(fp);
    return 0;
}

6.5 Using the Index File to Extract Data

This section contains a unix shell program and a modified version of the template Extractor program that, together with an index file, may be used to extract data from the Berkeley Options Data Base. The shellscript reads in a list of ticker symbols from the file ``bodbread.in'' and a list of BODB filenames from the file ``bodbread.files.'' For each BODB file resXXX, it assumes there exists an index file ``index.resXXX,'' created by the program MakeIndex above. For each file named in bodbread.files, the shellscript creates a temporary extraction file called ``templist'' by grepping the appropriate lines out of the index file. Then, it calls the C program ``IndexExtractor,'' which extracts the specified records from the BODB file.

To extract data, modify the IndexExtractor program as you wish, compile the code using an ANSI-C compiler, create the input files bodbread.in and bodbread.files, then run the shell script:

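#!/bin/sh
# Tickers to extract, and BODB files to extract them from
XX=`cat bodbread.in`
YY=`cat bodbread.files`
for Y in $YY
do
    # Collect the record ranges for every requested ticker in this file
    for X in $XX
    do
        egrep "[ ]$X[ ]" index.$Y | cut -c 12-26 >> templist
    done
    IndexExtractor $Y templist
    rm -f templist
done > bodbread.out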

 

Here is the code for the Extraction Program:

/* IndexExtractor
   C Program to Extract Data from an indexed BODB file
   Copyright (C) Stewart Mayhew, March 1995. */

#include <string.h>
#include <stdlib.h>
#include <stdio.h>

#define RECLEN 41   /* bytes per record, newline included */

/* Main Function Begins Here */
int main(int argc, char **argv)
{
    char buf[256];
    char buftix[256];
    FILE *fp;
    FILE *fp2;
    char record[60];
    long count, offsetnum, numrecs, beginp, endp;

    /* Check for proper input */
    if (argc != 3) {
        fprintf(stderr, "usage: %s <datafile> <extractionfile>\n", argv[0]);
        exit(1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        fprintf(stderr, "Can't find data file %s\n", argv[1]);
        exit(1);
    }
    if ((fp2 = fopen(argv[2], "r")) == NULL) {
        fprintf(stderr, "Can't find extraction file %s\n", argv[2]);
        exit(1);
    }

    /* Outer loop: one pass per line of the extraction file */
    while (fgets(buftix, 256, fp2) != NULL) {
        if (sscanf(buftix, "%ld %ld", &beginp, &endp) != 2)
            continue;           /* skip a line that cannot be parsed */
        offsetnum = (beginp - 1) * RECLEN;
        fseek(fp, offsetnum, SEEK_SET);
        numrecs = endp - beginp + 1;

        /* Inner Loop Begins Here */
        for (count = 1; count <= numrecs; count++) {
            if (fgets(buf, 256, fp) == NULL)
                break;

            /* Read Record */
            strncpy(record, buf, RECLEN);
            record[RECLEN] = '\0';
            printf("%s", record);
        } /* End of Loop through records */
    }

    fclose(fp);
    fclose(fp2);
    return 0;
}
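The shell script above depends on the begin and end record numbers starting at column 12 of each index line. As an alternative, an index line can be parsed by field inside a C program. The following is only a sketch: the names index_entry, parse_index_line, and record_offset are hypothetical, and the line layout (date, three-character ticker, first record, last record) is the one written by the MakeIndex printf statement.

```c
#include <stdio.h>

#define BODB_RECLEN 41L  /* bytes per record, newline included */

/* One line of the index: a date/ticker group and its record range. */
struct index_entry {
    long date;
    char ticker[4];      /* three characters plus the null */
    long first;          /* first record number of the group */
    long last;           /* last record number of the group */
};

/*
 * Parse one index line. The ticker is read with %3c rather than %s so
 * that a short, space-padded ticker is kept intact.
 * Returns 0 on success, -1 if the line is malformed.
 */
int parse_index_line(const char *line, struct index_entry *e)
{
    if (sscanf(line, "%ld %3c %ld %ld",
               &e->date, e->ticker, &e->first, &e->last) != 4)
        return -1;
    e->ticker[3] = '\0';
    if (e->first < 1 || e->last < e->first)
        return -1;
    return 0;
}

/* Byte offset of a record in the BODB file (record numbers start at 1). */
long record_offset(long recno)
{
    return (recno - 1) * BODB_RECLEN;
}
```

The offset returned by record_offset is the value passed to fseek in the extraction programs above.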

 

6.6 Economizing Storage Space

You can decrease the amount of space required to store BODB files by a factor of about 8:1 using a simple compression program. The unix compress program will work fine for this purpose, but we recommend using gzip, which is nearly as universal as compress but uses a more efficient compression algorithm.

In addition, you can reduce storage space by reducing the amount of redundant information in the database. For example, each record contains the date and ticker symbol. If you use the indexing program suggested above, dates and ticker symbols can be recovered from the index file, so you can remove nine characters [3-11] from each record, reducing storage size by nearly one quarter.
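Removing a range of columns from each record is straightforward in C. The sketch below is one possible way to do it; the function name strip_columns is hypothetical, and columns are counted from 1 as elsewhere in this guide.

```c
#include <string.h>

/*
 * Copy record to out with columns first..last (1-based, inclusive)
 * removed. The buffer out must hold at least strlen(record) + 1 bytes
 * and must not overlap record.
 *
 * Returns 0 on success, or -1 if the range is invalid or extends past
 * the end of the record.
 */
int strip_columns(const char *record, int first, int last, char *out)
{
    size_t len = strlen(record);

    if (first < 1 || last < first || (size_t)last > len)
        return -1;

    memcpy(out, record, (size_t)(first - 1));   /* columns before the range */
    strcpy(out + first - 1, record + last);     /* columns after the range */
    return 0;
}
```

Calling strip_columns(buf, 3, 11, out) on each record removes the ticker and date fields. The index file must then be kept alongside the reduced file, since it becomes the only record of which date and ticker each line belongs to.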

If you resort the data by option series, you can modify the index program to create one entry for each series. This will greatly increase the size of the index file, but will save you another eight characters per record. In the vast majority of cases, the last two digits in the strike price field are ``00''. Exceptions are for options on low-priced stocks with strikes separated by $2.50, and options on stocks which have recently split. If you are storing a subset of data that only contains even-dollar-incremented strike prices, you can remove these two characters. If you are storing only equity options, you can remove character [40], which is always zero. If you try hard enough you should be able to reduce a 150 megabyte BODB file to about 10 megabytes. Please be sure to carefully document all formatting changes you make, and be sure never to disturb the data on the original tapes.

 

 

Appendix

A.1 Bibliography of Papers Using Options Transaction Data

A large number of published articles and working papers have used BODB or MDR data to study securities pricing, market microstructure, and other similar topics. Here is a partial list of these many papers. [This section of the user's guide is still under construction.]

Aggarwal, Raj and Edward Gruca, ``Intraday Trading Patterns in the Equity Options Markets,'' Journal of Financial Research v14 n4 (Winter 1993): 285-297.

(Examines intraday patterns in volume, proportion of small trades, proportion of transactions on upticks, quoted price levels, and bid-ask spreads in the options market.)

[Data: BODB (Jul-Dec, 1986)]

Ancel, Esther Weinstock and Ramash K. S. Rao, ``Stock Returns and Option Prices: An Exploratory Study,'' Journal of Financial Research v13 n3 (Fall 1990): 173-185.

(Uses Options Data to back out implied parameters of an option pricing model)

[Data: BODB (Feb-Jul, 1979)]

Bhattacharya, Mihir, ``Empirical Properties of the Black-Scholes option-pricing Formula Under Ideal Conditions,'' Journal of Financial and Quantitative Analysis v15 n5 (Dec 1980): 1081-1105.

(Uses stock returns data on CBOE traded options to test whether the discretely-rebalanced Black-Scholes hedging strategy truly replicates the option)

Bhattacharya, Mihir, ``Transactions Data Tests of Efficiency of the Chicago Board Options Exchange,'' Journal of Financial Economics v12 (1983): 161-185.

(Uses transactions data to test the arbitrage boundary conditions imposed by rational option pricing and the uniformity of implied volatilities across options imposed by the Black-Scholes model)

[Data: BODB (Aug 1976-Jun 1977)]

Chan, Kalok, Y. Peter Chung and Herb Johnson, ``Why Option Prices Lag Stock Prices: A Trading-based Explanation,'' Journal of Finance v48 n5 (Dec 1993): 1957-1967.

(Finds that stock prices lead option prices and attributes this to the larger relative tick size in option markets)

[Data: BODB (Jan-Mar, 1986)]

Diz, Fernando, ``Long and Short-Run Dynamics of Volatility Formation in the S&P 100 Index Options Market: An Empirical Examination,'' Working Paper (1993).

[Data: MDR OEX (1985-1988)]

Diz, Fernando and Thomas J. Finucane, ``Index Options Expirations and Market Volatility,'' Working Paper (1994).

(Examines index option volatility near expiration dates)

[Data: MDR OEX (1985-1988)]

Diz, Fernando and Thomas J. Finucane, ``The Rationality of Early Exercise Decisions: Evidence from the S&P Index Options Market,'' Review of Financial Studies v6 n4 (Winter 1993): 765-797.

[Data: MDR OEX (Apr 1983-Dec 1988)]

Diz, Fernando and Thomas J. Finucane, ``The Time Series Properties of Implied Volatility of S&P Index Options,'' Journal of Financial Engineering (June 1993).

[Data: MDR OEX (Jan 1984-Aug 1987)]

Frankfurter, George M. and Wai K. Leung, ``Further Analysis of the Put-Call Parity Implied Risk-Free Interest Rate,'' Journal of Financial Research v14 n3 (Fall 1991): 217-232.

(Uses option prices to back out the interest rate implied by the put-call parity relationship, and compares the implied rates with T-bill rates)

[Data: BODB (5 stocks, 1982-1983)]

Kamara, Avraham and Thomas W. Miller, Jr., ``Daily and Intradaily Tests of European Put-Call Parity,'' Journal of Financial and Quantitative Analysis v30 n4 (Dec 1995): 519-539.

(Tests the put-call parity relationship for SPX options)

[Data: BODB (SPX Jan-Mar 1989)]

Mayhew, Stewart, Atulya Sarin and Kuldeep Shastri, ``The allocation of informed trading across related markets: An analysis of the impact of changes in equity-option margin requirements,'' Journal of Finance v50 n5 (Dec 1995): 1635-1653.

(Examines the effect of option margin requirements on the underlying stock market.)

Peterson, David R., ``A Transaction Data Study of Day-of-the-Week and Intraday Patterns in Options Returns,'' Journal of Financial Research v13 n2 (Summer 1990): 117-131.

(Examines intertemporal patterns in options returns)

[Data: BODB (consolidated, 1983-1985)]

Rubinstein, Mark and Anand M. Vijh, ``The Berkeley Options Data Base: A Tool for Empirical Research,'' Advances in Futures and Options Research v2 (1987): 209-221.

(Describes the Berkeley Options Data Base)

Sheikh, Aamir M. and Ehud I. Ronn, ``A Characterization of the Daily and Intra-day Behavior of Returns on Options,'' Journal of Finance v49 n2 (Jun 1994): 557-579.

(Examines intertemporal patterns in options returns, correcting for changes in underlying stock prices)

[Data: BODB (Jan 1986-Sep 1987)]

 

A.2 List of BODB Files

 

Filename  Dates             Filename  Dates
res01     760823-761119     res53     Aug 1983
res02     761122-770218     res54     Sep 1983
res03     770222-770520     res55     Oct 1983
res04     770523-770819     res56     Nov 1983
res05     770822-771021     res57     Dec 1983
res06     771024-771230     res58     Jan 1984
res07     Jan-Feb 1978      res59     Feb 1984
res08     Mar-Apr 1978      res60     Mar 1984
res09     May-Jun 1978      res61     Apr 1984
res10     Jul-Aug 1978      res62     May 1984
res11     Sep-Dec 1978      res63     Jun 1984
res12     Jan-Mar 1979      res64     Jul 1984
res13     Apr-Jul 1979      res65     Aug 1984
res14     Aug-Sep 1979      res66     Sep 1984
res15     Oct-Nov 1979      res67     Oct 1984
res16     Dec 1979          res68     Nov-Dec 1984
res17     Jan-Feb 1980      res69     Jan 1985
res18     Mar-Apr 1980      res70     Feb 1985
res19     May-Jun 1980      res71     Mar-Apr 1985
res20     Jul-Aug 1980      res72     May-Jun 1985
res21     Sep 1980          res73     Jul 1985
res22     Oct 1980          res74     Aug 1985
res23     Nov-Dec 1980      res75     Sep 1985
res24     Jan-Feb 1981      res76     Oct 1985
res25     Mar 1981          res77     Nov 1985
res26     Apr 1981          res78     Dec 1985
res27     May 1981          res79     Jan 1986
res28     Jun 1981          res80     Feb 1986
res29     Jul 1981          res81     Mar 1986
res30     Aug 1981          res82     Apr 1986
res31     Oct 1981          res83     May 1986
res32     Nov 1981          res84     Jun 1986
res33     Dec 1981          res85     Jul 1986
res34     Jan 1982          res86     Aug 1986
res35     Feb 1982          res87     Sep 1986
res36     Mar 1982          res88     Oct 1986
res37     Apr 1982          res89     Nov 1986
res38     May 1982          res90     Dec 1986
res39     Jun 1982          res91     Jan 1987
res40     Jul 1982          res92     Feb 1987
res41     Aug 1982          res93     Mar 1987
res42     Sep 1982          res94     Apr 1987
res43     Oct 1982          res95     May 1987
res44     Nov 1982          res96     Jun 1987
res45     Dec 1982          res97     Jul 1987
res46     Jan 1983          res98A    Aug I 1987
res47     Feb 1983          res98B    Aug II 1987
res48     Mar 1983          res99A    Sep I 1987
res49     Apr 1983          res99B    Sep II 1987
res50     May 1983          res100A   Oct I 1987
res51     Jun 1983          res100B   Oct II 1987
res52     Jul 1983

 

Filename  Dates             Filename  Dates
res101    Nov 1987          res143    Dec 1990
res102    Dec 1987          res144    Jan I 1991
res103    Jan 1988          res145    Jan II 1991
res104    Feb 1988          res146    Feb I 1991
res105    Mar 1988          res147    Feb II 1991
res106    Apr 1988          res148    Mar I 1991
res107    May 1988          res149    Mar II 1991
res108    Jun 1988          res150    Apr I 1991
res109    Jul 1988          res151    Apr II 1991
res110    Aug 1988          res152    May I 1991
res111    Sep 1988          res153    May II 1991
res112    Oct 1988          res154    Jun I 1991
res113    Nov 1988          res155    Jun II 1991
res114    Dec 1988          res156    Jul I 1991
res115    Jan 1989          res157    Jul II 1991
res116    Feb 1989          res158    Aug I 1991
res117    Mar 1989          res159    Aug II 1991
res118    Apr 1989          res160    Sep I 1991
res119    May 1989          res161    Sep II 1991
res120    Jun 1989          res162    Oct I 1991
res121    Jul 1989          res163    Oct II 1991
res122    Aug 1989          res164    Nov I 1991
res123    Sep 1989          res165    Nov II 1991
res124    Oct 1989          res166    Dec I 1991
res125    Nov 1989          res167    Dec II 1991
res126    Dec 1989          res168    Jan I 1992
res127    Jan 1990          res169    Jan II 1992
res128    Feb 1990          res170    Feb I 1992
res129    Mar 1990          res171    Feb II 1992
res130    Apr 1990          res172    Mar I 1992
res131    May 1990          res173    Mar II 1992
res132    Jun 1990          res174    Apr I 1992
res133    Jul I 1990        res175    Apr II 1992
res134    Jul II 1990       res176    May I 1992
res135    Aug I 1990        res177    May II 1992
res136    Aug II 1990       res178    Jun I 1992
res137    Sep I 1990        res179    Jun II 1992
res138    Sep II 1990       res180    Jul I 1992
res139    Oct I 1990        res181    Jul II 1992
res140    Oct II 1990       res182    Aug I 1992
res141    Nov I 1990        res183    Aug II 1992
res142    Nov II 1990       res184    Sep I 1992

Filename Dates

res185

res186

res187

res188

res189

res190

res191

res192

res193

res194

res195

res196

res197

res198

res199

res200

res201

res202

res203

res204

res205

res206

res207

res208

res209

res210

res211

res212

res213

res214

res215

res216

res217

res218

res219

res220

res221

res222

res223

res224

res225

res226

res227

Sep II 1992

Oct I 1992

Oct II 1992

Nov I 1992

Nov II 1992

Dec I 1992

Dec II 1992

Jan I 1993

Jan II 1993

Jan III 1993

Feb I 1993

Feb II 1993

Mar I 1993

Mar II 1993

Mar III 1993

Apr I 1993

Apr II 1993

Apr III 1993

May I 1993

May II 1993

May III 1993

Jun I 1993

Jun II 1993

Jun III 1993

Jul I 1993

Jul II 1993

Jul III 1993

Aug I 1993

Aug II 1993

Aug III 1993

Sep I 1993

Sep II 1993

Sep III 1993

Oct I 1993

Oct II 1993

Oct III 1993

Oct IV 1993

Nov I 1993

Nov II 1993

Nov III 1993

Dec I 1993

Dec II 1993

Dec III 1993

 

 

Starting in 1994, the naming convention changed to the following simpler form.

Filename  Dates          Filename  Dates
res94001                 res95001
res94002                 res95002
res94003                 res95003
res94004                 res95004
...                      ...
res94046                 res95106
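The filenames listed above suggest a pattern of `res`, a two-digit year, and a three-digit sequence number within that year. This is an inference from the names shown, not a documented specification. Under that assumption, a minimal Python sketch for splitting such a name into its parts, or building one, might look like the following. The names `ResFile`, `parse_res_filename`, and `format_res_filename` are hypothetical and are not part of any BODB software.

```python
import re
from typing import NamedTuple


class ResFile(NamedTuple):
    year: int       # four-digit calendar year
    sequence: int   # position of the file within that year


# Assumed pattern: "res" + two-digit year + three-digit sequence number.
_NEW_STYLE = re.compile(r"res(\d{2})(\d{3})")


def parse_res_filename(name: str) -> ResFile:
    """Split a 1994-style filename such as 'res94001' into year and sequence."""
    match = _NEW_STYLE.fullmatch(name)
    if match is None:
        # Pre-1994 names (res101, res98A, ...) do not fit the pattern.
        raise ValueError(f"not a 1994-style filename: {name!r}")
    # Assumption: every two-digit year refers to the 1900s.
    return ResFile(1900 + int(match.group(1)), int(match.group(2)))


def format_res_filename(year: int, sequence: int) -> str:
    """Build a filename from a calendar year and a sequence number."""
    return f"res{year % 100:02d}{sequence:03d}"
```

For example, `parse_res_filename("res94046")` yields year 1994 and sequence 46, while an older name such as `res100A` is rejected with a `ValueError`, which keeps the two naming schemes from being confused.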

 

 

A.3 Ticker Symbol Identification

A.3.1 Ticker Symbols for CBOE Index Options

Ticker Index Exercise Style
OEX     S&P 100 Index
OEZ     S&P 100 Index - OEX strike overflow
CPO     S&P 100 Index - CAPS
SPX     S&P 500 Index
SPZ     S&P 500 Index - SPX strike overflow
NSX     S&P 500 Index - PM Expiration
SPL     S&P 500 Index - Long-Dated
SPQ     S&P 500 Index - End-of-Quarter
CPS     S&P 500 Index - CAPS
BIX     S&P Banking Index
BGX     CBOE BioTech Index
CEX     S&P Chemical Index
CWX     CBOE Computer Software Index
EVX     CBOE Environmental Index
GAX     CBOE Gaming Index
GTX     CBOE Global Telecommunications Index
HCX     S&P Health Care Index
IUX     S&P Insurance Index
RIX     CBOE REIT Index
RLX     S&P Retail Index
TCX     CBOE U.S. Telecommunications Index
TRX     S&P Transportation Index
FSX     FT-SE 100 Index
ISX     CBOE Israel Index
MEX     CBOE Mexico Index
MZX     CBOE Mexico Index (MEX strike overflow)
NIK     Nikkei 300 Index
NDX     NASDAQ 100 Index
RUT     Russell 2000 Index
SGX     S&P/Barra Growth Index
SVX     S&P/Barra Value Index

American

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

European

 

 

 

 

 

 

 

 

 

A.3.2 Ticker Symbols for Index LEAPS

Ticker Index Expiration

OAX S&P 100 Index 1993

OBX S&P 100 Index 1994

OLX S&P 100 Index 1992, 1995

OCX S&P 100 Index 1996

LSW S&P 500 Index 1993

LSY S&P 500 Index 1994

LSX S&P 500 Index 1992, 1995

LSZ S&P 500 Index 1996

WRU Russell 2000 Index 1994

VRU Russell 2000 Index 1995

LRU Russell 2000 Index 1996

WBG CBOE BioTech Index 1994

VBG CBOE BioTech Index 1995

 


LBG CBOE BioTech Index 1996

VEX CBOE Mexico Index 1995

VNX Nikkei 300 Index 1995

 

A.3.3 Ticker Symbols for Interest Rate Options

Ticker  Underlying
IRX     13-week T-bill
VXB     13-week T-bill (1995 LEAP)
LXB     13-week T-bill (1996 LEAP)
FVX     5-year Note
VXV     5-year Note (1995 LEAP)
LXV     5-year Note (1996 LEAP)
TNX     10-year Note
VXN     10-year Note (1995 LEAP)
LXN     10-year Note (1996 LEAP)
TYX     30-year Bond
VYY     30-year Bond (1995 LEAP)
LTY     30-year Bond (1996 LEAP)
LTX     Weighted Average Long-Term Rate (discontinued)

 

 

 

 

 

A.3.4 Ticker Symbols for CBOE Equity Options

(As of July 28, 1993)

Ticker  Listed  Delisted  Company Name
KKQ     921006            ACCLAIM ENTERTAINMENT, INC.
ACT     881027            ACTUA GROUP
ADT     911021            ADT LIMITED
AVQ     920508            ADVANTA CORPORATION CL. A
ABQ     930313            ADVANTA CORP., CLASS B
AFP     881219            AFFILIATED PUBLICATIONS INC.
AFQ     920618            AFFYMAX N.V.
AQG     930709            AGNICO-EAGLE MINES LTD.
ABF     901030            AIRBORNE FREIGHT CORPORATION
ALC     930423            ALC COMMUNICATIONS CORP.
ALA     921204            ALCATEL ALSTHOM ADR
AAL     840221            ALEXANDER & ALEXANDER SERVICES
AEG     750922            ALLEGIS
AYQ     911121            ALLIANCE PHARMACEUTICAL CORP.
ATK     910301            ALLIANT TECHSYSTEMS INC
ALS     841226  861231    ALLIED STORES
AA      741217            ALUMINUM CO. OF AMERICA
AU      930503            AMAX GOLD INC.
AMH     810629            AMDAHL CORPORATION
AEP     750523            AMERICAN ELECTRIC POWER CO.
AXP     770518            AMERICAN EXPRESS CO.
AGC     850730            AMERICAN GENERAL
AGQ     850603  910722    AMERICAN GREETINGS
AHS     750623  851125    AMERICAN HOSPITAL SUPPLY
AIT     850813            AMERICAN INFO TECHNOLOGY
AIG     841022            AMERICAN INT'L GROUP
PWQ     910816            AMERICAN POWER CONVERSION CORP
ASC     880810            AMERICAN STORES
T       730426            AMERICAN TELEPHONE AND TELEGRA
AIT     850813            AMERITECH
ATQ     900123  920629    AMER. T.V. & COMMUNICATIONS CL
AN      750624            AMOCO
AMP     750926            AMP INCORPORATED
APC     870420            ANADARKO PETROLEUM
AQN     930503            ANDREW CORP.
APA     800725  880915    APACHE CORPORATION
APQ     850603  860703    APOLLO COMPUTER INC.
AAQ     850603  910521    APPLE COMPUTER INC.
APM     881219            APPLIED MAGNETICS CORP.
ARA     841022  841219    ARA SERVICES
OIQ     920928            ARTISOFT, INC.
ARC     730426            ATLANTIC RICHFIELD CORP.
AIQ     920319            ATLANTIC SOUTHEAST AIRLINES
AQT     930604            ATMEL CORPORATION
URQ     920131            AURA SYSTEMS, INC.
TQO     930707            AUTOTOTE (CLASS A)
AZO     920127            AUTOZONE, INC.
AVP     730801            AVON PRODUCTS INC.
AQR     930406            AZTAR CORP.
JBQ     920928            BAKER (J), INC.
BLY     930709            BALLY MANUFACTURING CORPORATIO
BLY     770301  910521    BALLY MANUFACTURING CORP.
BDG     930201            BANDAG, INC.
BK      881219            BANK OF NEW YORK COMPANY, INC.
BKQ     930218            BANK SOUTH CORP.
BAC     760701            BANKAMERICA CORPORATION
BLH     930701            BANKERS LIFE HOLDING CORP.
BNQ     930210            BANYAN SYSTEMS INC.
BTI     921204            BAT INDUSTRIES ADR
BMG     870717            BATTLE MOUNTAIN GOLD CORP.
BAX     750523            BAXTER INTERNATIONAL, INC.
BBQ     900417            BAYBANKS, INC.
BCE     901204  920118    BCE INC.