The Berkeley Options Data Base
User's Guide
The Berkeley Options Data Base
Institute of Business and Economic Research #1922
University of California, Berkeley
Berkeley CA 94720-1922
Release 3.0
Revised August 1998
1 Introduction
The Berkeley Options Data Base (BODB) is a complete record of trading activity on the floor of the Chicago Board Options Exchange (CBOE). Derived from the CBOE's Market Data Retrieval (MDR) tapes, the database contains every bid-ask quote and every trade recorded on the floor of that exchange, time-stamped to the nearest second. The database begins on August 23, 1976 and is updated annually, usually in March or April. At the time of this printing (June 1, 1995), the database is available through December 31, 1994.
This document is intended to be a comprehensive user's guide to the Berkeley Options Data Base. It describes the database in detail, and contains valuable information on how to acquire, access, and conduct research using the database. As the database is currently designed to be accessed through a unix workstation, specific technical details are offered only for the unix platform. For use on other systems, it is necessary to find a utility that can read a unix tarfile. If you have any further questions about the database, or comments about this guide, please contact the database manager.
This guide is organized as follows. Section 2 describes the MDR file, which contains the raw data from which BODB is derived. It explains how the MDR file is created, how the tapes may be purchased, how the information is stored, and how the data are organized. Section 3 describes the BODB, how it may be purchased, how it is stored, and how the data are organized. Section 4 contains some historical information about the BODB that may help you understand things you have read elsewhere about the database. It describes former BODB products that have been discontinued, such as the ``consolidated format'' and the ``supplemental tape,'' and describes format changes in the ``resorted'' data. Section 5 is a research guide, containing pointers on how to identify ticker symbols, how stock splits are treated, and other such matters. It also describes various other data sources that may be of interest. Section 6 contains technical advice on how to read the data off the tapes and sample computer programs (in various languages) that may be used to read and manipulate the data. The appendix contains a bibliography of research papers that have employed the BODB or MDR data, a complete list of all BODB files and the calendar dates contained in each, and a list of ticker symbols.
2 The Market Data Retrieval File
The Market Data Retrieval (MDR) file is produced by the CBOE, and constitutes a complete record of bid-ask quotes and trades recorded on the floor of that exchange. Each record is time-stamped to the nearest second, and contains the contemporaneous price of the underlying security.
2.1 Availability
The entire MDR may be purchased from the CBOE, by the month, on 6250 bpi magnetic tapes. As of January, 1995, the rate was $500 per month of data, with a 10% discount on orders of six months or more, and a 15% discount on orders of one year or more. The MDR data for S&P 100 (OEX) index options may be purchased separately at a price of $450 for three years of data. Data for S&P 500 (SPX) index options may be purchased at the same rate. Other individual securities from the MDR may be purchased at the rate of $100 per month for the first security and $30 per month for each additional security.
The CBOE also offers various summary files. The ``Option Summary'' file contains daily high, low, and closing prices, trading volume and open interest, since October 1, 1985. One year of the Option Summary file for up to ten securities may be purchased for $125 on 6250 bpi tapes or $100 on floppy disk or hardcopy. The ``Expanded Option Summary'' file contains all the information in the Option Summary file plus the underlying stock price. One year of the Expanded Option Summary file for up to ten securities may be purchased for $150 on 6250 bpi tapes or $125 on floppy disk or hardcopy. The ``Index Summary'' file contains daily information for the various indexes on which options are traded at the CBOE, including the daily high, low, and closing index value, the change from the previous close, total trading volume on calls, total trading volume on puts, total open interest on calls, and total open interest on puts. The Index Summary file may be purchased at the rate of $25 per index on floppy disk, in which case data are available beginning October 1, 1985, or $15 per index hardcopy, in which case the data are available beginning March 11, 1983.
The ``Volume Summary'' file contains daily observations of total call volume, total put volume, total call open interest, and total put open interest for any option class trading on the CBOE. One year of data on up to ten securities may be purchased on hardcopy or floppy disk for $75. Finally, the ``Total CBOE Volume Summary'' file contains daily observations of total call volume and total put volume on the CBOE. The entire file, which begins in January, 1978, may be purchased on floppy disk for $25 or hardcopy for $15. For ordering information, contact the CBOE data sales hotline at (312) 786-7426. To obtain information on contract specifications and ticker symbols, call the CBOE marketing department at (312) 786-7434.
2.2 Data Entry
The MDR contains four main types of records: trade records, quote records, cancel records, and underlying records. Quote records contain bid and ask prices, while trade records contain transaction price and volume. Cancel records, as the name indicates, cancel previous records on the same underlying contract. Trade, quote, and cancel records are all time-stamped, and contain a contemporaneous observation of the underlying stock price. Underlying records contain information about the underlying stock that is recorded on the MDR without a trade, quote, or cancel having occurred.
Some quote records are recorded on the floor of the exchange by a Quote Reporting Terminal Operator, who enters bid-ask quotes as they are shouted in the trading crowd. The reporting lag for quotes should be very short, only as long as is required for the terminal operator to enter the option identification and the quote, which should be less than five seconds. In addition, many options are now quoted through the ``autoquote'' system. In this case, a market maker chooses the input parameters for a Black-Scholes or Cox-Ross-Rubinstein pricing model, and bid-ask quotes are automatically updated by computer whenever the underlying stock price changes. The autoquote system has led to a large increase in the volume of data recorded on the MDR over the last few years. A large portion of the recent MDR data is made up of quotes on index options, where the underlying index is recalculated every 15 seconds, and the autoquote system continually spits out fresh quotes.
Trade records are recorded by a Price Reporting Terminal Operator. The reporting delay for trades may be considerably longer than for quotes. After a verbal agreement to a trade has been consummated between two members of the trading crowd, the seller writes up the trade on a blank ticket he is carrying, and deposits a copy of the sell ticket on a conveyor belt at the post. This process generally takes from 5 to 40 seconds, depending on the number of traders involved, how fast they write, and how far they are from the conveyor belt. When trading is particularly active, traders might hold onto these tickets for up to several minutes before depositing them on the conveyor belt. Upon receiving the ticket, the Terminal Operator immediately removes the ticket from the bin, with a single key stroke simultaneously enters the stock symbol, expiration month, and strike price, and then separately enters the number of contracts traded, the transaction price, and the identifying symbols of the buying and selling floor traders.
The computer completes the record by automatically registering the time of day and the most recent transaction price of the underlying stock. One terminal operator handles call options and another puts, at separate terminals. In special circumstances, the Terminal Operator will also enter a ``transaction prefix,'' indicating, for example, that the trade is known to be part of a spread order, or is known to be out of sequence. Because trades take longer to record than quotes, great care should be taken in interpreting the time sequence reported in the MDR or Berkeley Options Data Base.
2.3 Storage Format
The MDR may be acquired from the CBOE on standard or non-labelled 6250 bpi magnetic tapes. The Berkeley Options Data Base receives the MDR on non-labelled tapes, on which the MDR is stored as a large fixed-length EBCDIC file, on multiple tapes, with record length 61 and blocksize 32757. Because the file is stored as a multiple-tape file, each MDR tape is crammed full of data. On a unix system, with 9-track tape drive designated /dev/rst1, the data may be transferred to hard disk using the command:
dd if=/dev/rst1 of=mdrdata bs=32757 cbs=61 conv=ascii
The if= option specifies the input file or device, the of= option specifies the output file, bs designates the blocksize, conv instructs the program to convert the data (in this case from EBCDIC to ASCII), and cbs (conversion buffer size) tells the program to split the file into lines as it converts it. Note that the cbs option only works when the conv option is specified.
This will create a fixed-length file of around 257,000,000 bytes. This number is slightly larger than the capacity of a 9-track tape because the conversion program adds end-of-line characters.
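On a system without dd, the same conversion can be approximated in a few lines of Python. The following is only a sketch, not part of the BODB distribution: it assumes the raw tape image has already been copied to disk as bytes, and it assumes that Python's `cp037` EBCDIC codec agrees with dd's conversion table for the letters, digits, and blanks that appear in MDR records. The function name is hypothetical.

```python
RECORD_LENGTH = 61  # bytes per MDR record, as stated above


def ebcdic_records_to_lines(raw, codec="cp037"):
    """Split raw EBCDIC bytes into 61-byte records and decode each to text.

    The codec is an assumption: cp037 is one common EBCDIC code page and may
    differ from dd's table for some punctuation characters.
    """
    if len(raw) % RECORD_LENGTH != 0:
        raise ValueError("input is not a whole number of 61-byte records")
    lines = []
    for start in range(0, len(raw), RECORD_LENGTH):
        lines.append(raw[start:start + RECORD_LENGTH].decode(codec))
    return lines
```

Writing each decoded record followed by a newline reproduces the one-record-per-line layout; the extra newline per record is why the converted file is slightly larger than the tape contents.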
2.4 Sorting Order
Within the MDR, records are sorted first by date, then by underlying security. Within each underlying security calls are listed before puts. The calls (puts) are sorted in order of expiration month, within each expiration month by strike price, and within each strike price the records are listed chronologically. In summary, records are sorted according to the scheme
DATE : UNDERLYING ASSET : CALL/PUT : EXPIRATION : STRIKE : TIME
2.5 Record Layout
The MDR file contains fixed-length records with the following (undelimited) fields:
| Description | Field Type | Length |
| Trade Date | Numeric | 6 |
| Option Class | Alpha | 3 |
| Expiration Month Symbol | Alpha | 1 |
| Strike Price Symbol | Alpha | 1 |
| Time | Numeric | 6 |
| Trade Price Integer | Numeric | 3 |
| Trade Price Fraction | Numeric | 3 |
| Volume | Numeric | 5 |
| Bid Price Integer | Numeric | 3 |
| Bid Price Fraction | Numeric | 3 |
| Ask Price Integer | Numeric | 3 |
| Ask Price Fraction | Numeric | 3 |
| Stock Price Integer | Numeric | 5 |
| Stock Price Fraction | Numeric | 1 |
| Extra Space | | 1 |
| Security Type Symbol | Numeric | 1 |
| Record ID | Numeric | 2 |
| Prefix Code | Alpha | 4 |
| Put/Call Code | Alpha | 1 |
| Expiration Month | Alpha | 3 |
| Strike Price | Alpha | 3 |
Here are a few sample lines from the MDR file:
920922GPSOG142911000000000000040000040040003320102    PMAR035
920922GPSOG143914000000000000040000040120003310102    PMAR035
920922GPSOG144049004004000050000000000000003310101    PMAR035
920922GPSOH083237000000000000070120070240003330102    PMAR040
920922GPSOH083602000000000000070160070280003330102    PMAR040
920922GPSOH083651000000000000070120070240003330102    PMAR040
920922GPSOH084614000000000000070080070200003340102    PMAR040
920922GPSOH085117000000000000070120070240003350102    PMAR040
920922GPSOH085611000000000000070080070200003350102    PMAR040
2.5.1 Date and Time
These transactions were recorded on September 22, 1992. This can be ascertained from the first six characters of each line [1-6], which contain the date in the form YYMMDD.
Characters [12-17] contain the timestamp, recorded in Central Standard Time, in the form HHMMSS. Thus, the third record was recorded at 2:40:49 P.M., and the fourth record was recorded at 8:32:37 A.M.
2.5.2 Ticker Symbol
Between the date and time are five characters [7-11] containing a ticker symbol that completely identifies the option contract. Note that only the first three of these characters appear in the Berkeley Options Data Base.
The first three letters are unique to the underlying asset. One or two of these characters may be blank. In the example above, the letters GPS signify that these are equity options on Gap Stores stock. For options on stocks traded on the New York Stock Exchange or the American Stock Exchange, these three letters will generally be the same as the underlying stock market ticker symbol. For options on NASDAQ stocks, the option ticker usually contains two letters from the NASDAQ ticker (not necessarily the first two), plus the letter Q. So, for example, NASDAQ ticker symbol AAPL might become AAQ. Other ticker symbols indicate index options, LEAPS, interest rate options, or other types of options. For more details, see the section on ticker symbols, below.
The fourth letter of the ticker symbol identifies the contract type and expiration date. The letters A through L designate call options expiring in January (A) through December (L). The letters M through X represent put options expiring in January (M) through December (X). In this example, the letter O designates a March put option.
The fifth letter identifies the last two digits in the option's strike price. The letter A signifies a strike price ending in 05, B is a strike ending in 10, and so on. In the example above, the first three records are for options with a strike price ending in 35 (G), and the last six are for options with a strike price ending in 40 (H).
The five-letter ticker symbols provide a convenient way to sort the data. In fact, the MDR file is sorted, within each calendar day, according to the five-letter ticker symbol. This means that, as described above, the data are sorted by underlying firm, and within each firm the calls are separated from the puts, calls and puts are each sorted by expiration month, and within each expiration month in order of strike price. The information contained in the fourth and fifth letters of the ticker symbol is duplicated in the last seven characters of each line [55-61], where it is presented in a more intuitive format. In this case, the last seven characters are PMAR035 or PMAR040, indicating the March 35 and March 40 puts.
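The letter codes described above can be decoded mechanically. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the strike letters continue in steps of five from A (05) through T (00, that is, a strike ending in 100), which the text implies with ``and so on'' but does not spell out. Letters beyond T are rejected rather than guessed.

```python
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def decode_mdr_ticker(ticker):
    """Decode a five-character MDR ticker such as 'GPSOG'.

    Returns (underlying, put_call, month, strike_ending), where strike_ending
    is the last two digits of the strike price as an integer.
    """
    if len(ticker) != 5:
        raise ValueError("MDR ticker symbols have five characters")
    underlying = ticker[:3].strip()          # one or two characters may be blank
    month_index = ord(ticker[3]) - ord("A")  # A-L calls, M-X puts
    if not 0 <= month_index < 24:
        raise ValueError("unknown expiration letter: " + ticker[3])
    put_call = "C" if month_index < 12 else "P"
    month = MONTHS[month_index % 12]
    strike_index = ord(ticker[4]) - ord("A")  # A=05, B=10, ... (assumed up to T)
    if not 0 <= strike_index < 20:
        raise ValueError("unknown strike letter: " + ticker[4])
    strike_ending = (5 * (strike_index + 1)) % 100
    return underlying, put_call, month, strike_ending
```

Applied to the sample records, `decode_mdr_ticker("GPSOG")` yields a March put with a strike ending in 35, matching the PMAR035 suffix at the end of those lines.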
2.5.3 Record Type and Prefix Code
Immediately to the left of these last seven characters is a six-character field [49-54] containing a two-digit numeric record type (indicating whether it is a trade, quote, cancel, or underlying record) and, on occasion, a four-letter ``Prefix'' code giving additional information about the record. The four record types are:
| Number | Type |
| 01 | Trade |
| 02 | Quote |
| 03 | Cancel |
| 04 | Underlying |
and the prefix codes are:
| Code | Meaning |
| ROTA | Opening Rotation |
| ENDR | End of Opening Rotation |
| AUTO | Start of RAES, the Electronic Execution System |
| RAES | Transaction was Executed Electronically |
| ENDA | End of RAES |
| OPEN | Opening Trade, Recorded Late, Out of Sequence |
| OPNL | Opening Trade, Recorded Late, In Sequence |
| OSEQ | Recorded Late, Out of Sequence |
| LATE | Recorded Late, In Sequence |
| SPRD | Record is Part of a Combination Trade |
| STDL | Record is Part of a Straddle |
| FAST | Recorded under Fast Trading Conditions |
| HALT | Reopen After Trading Halt |
| CLOS | Closing Record |
| CNCO | Cancel the Opening Trade |
| CANC | Cancel the Last Trade, if it is not the Opening Trade |
| CNCL | Cancel Another Trade, not the Last or Opening Trade |
| CNOL | Cancel the Only Trade of the Day |
In the sample above, the third record is a trade record, and the others are all quote records.
2.5.4 Trade Price and Contract Volume
For trade records, the transaction price is recorded at [18-23], with the integer portion of the price recorded in the first three bytes [18-20] and the fractional part, recorded in thirty-seconds of a dollar, in the next three [21-23]. For example, in the third record in the above sample, the price is recorded as 004004, which translates to $4 4/32, or $4 1/8. Contract volume is recorded in the field [24-28], which in this case is 00005, or 5 contracts. Both the trade price field and the contract volume field will be empty for quote records.
2.5.5 Bid and Ask Prices
For quote records, the bid price is recorded in the field [29-34] and the ask price is recorded in the field [35-40]. As for trade prices, bid and ask prices are recorded as a three-digit integer followed by a three-digit fraction, denominated in thirty-seconds of a dollar. The first record in the sample above has a bid price of $4 (004000) and ask price of $4 1/8 (004004). The last record has a bid of $7 1/4 (007008) and an ask of $7 5/8 (007020). The bid and ask fields will be empty for trade records.
2.5.6 Underlying Asset Price
The underlying asset price is recorded at locations [41-46]. For equity options, the integer portion of the stock price is contained in the first five bytes, and the sixth byte contains the fractional portion of the price, denominated in eighths of a dollar. Thus, the first record in the sample data above was recorded when the current stock price for Gap Stores was $33 1/4 (000332), and the second was recorded when the underlying was at $33 1/8 (000331). For index options, the underlying value is usually recorded with the hundreds digit left-justified in this field, so that an OEX value of 455.31 would be recorded as 455310. The MDR database is not always consistent in the way index options are recorded.
2.5.7 Option Type Identifier
In byte [48] of the MDR record is a number that identifies the type of underlying security, according to the following table:
| Number | Security Type |
| 1 | EQTY |
| 2 | GNMA |
| 3 | TBND |
| 4 | EQTY Group |
| 5 | EQTY Index |
| 6 | INDEX |
| 7 | FCO |
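The field positions described in this section can be collected into a small parser. This is a sketch under stated assumptions rather than an official reader: the function and dictionary key names are hypothetical, the prefix field is assumed to be blank-padded to four characters, and only equity records (security type 1) have their underlying price decoded, since the text notes that index values are not recorded consistently. The sample line is assembled field by field from the third record shown earlier.

```python
from fractions import Fraction

RECORD_TYPES = {"01": "trade", "02": "quote", "03": "cancel", "04": "underlying"}


def price_32nds(field):
    """Decode a 6-character price: 3-digit integer part, then 32nds of a dollar."""
    return int(field[:3]) + Fraction(int(field[3:]), 32)


def parse_mdr_record(line):
    """Slice one 61-character MDR record into named fields."""
    if len(line) != 61:
        raise ValueError("expected a 61-character record")
    rec = {
        "date": line[0:6],             # [1-6]   YYMMDD
        "ticker": line[6:11],          # [7-11]  five-letter ticker
        "time": line[11:17],           # [12-17] HHMMSS, Central time
        "security_type": line[47],     # [48]
        "record_type": RECORD_TYPES.get(line[48:50], "unknown"),  # [49-50]
        "prefix": line[50:54].strip(), # [51-54] blank when no prefix applies
        "put_call": line[54],          # [55]
        "expiration": line[55:58],     # [56-58]
        "strike": line[58:61],         # [59-61]
    }
    if rec["record_type"] == "trade":
        rec["trade_price"] = price_32nds(line[17:23])  # [18-23]
        rec["volume"] = int(line[23:28])               # [24-28]
    elif rec["record_type"] == "quote":
        rec["bid"] = price_32nds(line[28:34])          # [29-34]
        rec["ask"] = price_32nds(line[34:40])          # [35-40]
    if rec["security_type"] == "1":
        # Equity: five integer digits [41-45], then eighths of a dollar [46].
        rec["underlying"] = int(line[40:45]) + Fraction(int(line[45]), 8)
    return rec


# Third sample record, built from its fields so each width is explicit.
SAMPLE = ("920922" + "GPSOG" + "144049" + "004004" + "00005" + "000000"
          + "000000" + "000331" + "0" + "1" + "01" + "    " + "P" + "MAR" + "035")
```

Prices are returned as exact fractions so that 1/32 and 1/8 increments are not blurred by floating-point rounding; `parse_mdr_record(SAMPLE)` describes a trade of 5 contracts at 4 1/8 with the stock at 33 1/8.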
3 The Berkeley Options Data Base
The Berkeley Options Data Base (BODB) is associated with the Institute of Business and Economic Research at the University of California, Berkeley. Through a contractual arrangement with the CBOE, BODB offers a reprocessed version of the Market Data Retrieval file, beginning August 23, 1976. The database is updated annually, and is currently available through December 31, 1994. The database is managed by a graduate student in the finance group at the Haas School of Business, under the direction of Professor Mark Rubinstein. At the time of this printing (August 17, 1998), the current database manager is Kehong Wen. You may contact BODB via e-mail to options@haas.berkeley.edu, by phone at (510) 643-8893, by fax at (510) 642-5018, or by mail to the Berkeley Options Data Base, Institute of Business and Economic Research MC #1922, Berkeley, CA 94720-1922.
3.1 Availability
The Berkeley Options Data Base is available on 8mm magnetic tapes, sometimes known as ``Exabyte'' tapes. For data prior to 1990, one year of data are stored on a single 8mm tape. Beginning with 1990, six months of data are stored on a single tape. To access the data, it is necessary to use the unix ``tar'' (tape archive) utility. In addition, at least 200 megabytes of hard disk space are required. Normally, the tapes are written using an Exabyte 8500 drive, and cannot be read by the (lower-density) Exabyte 8200 drive. However, the tapes may be written at low-density through special arrangement with the database manager.
Due to the increasing volume of data, data beginning in January 1994 will be compressed using the gzip compression facility.
Plans are underway to also offer the database on 4mm ``dat'' tapes and eventually, CD ROM. Through special arrangement, the data may also be purchased through the internet via ftp. The database is no longer available on 6250 bpi 9-track tapes.
The Berkeley Options Data Base may be purchased at the rate of $200 per month of historical data, with a minimum purchase of six months. In addition, there is a processing charge of $80 per 8mm tape.
Customers in California must also pay sales tax. Academic institutions may acquire the data at a special rate of $150 per month. Academic institutions qualify for a volume discount rate of $120 per month if they have cumulatively purchased 36 months of data. BODB does not currently offer subsets of the database. If you want more than one copy of the tape, for example if one of your tapes is lost or damaged, there is a processing charge of $80 per tape.
To purchase the data, it is necessary to sign a ``subscription agreement'' contract. If your institution already has a contract on file at the BODB from a previous purchase, it is not necessary to sign another contract. To obtain a contract and order form, contact the database manager.
3.2 Storage Format
The BODB is stored in fixed-length ASCII files, archived in unix tarfiles on 8mm tapes. To restore a file onto hard disk on a unix system where the 8mm tape drive is designated /dev/rst5, place the tape in the drive, make sure there is more than 161 MB of disk space in the current partition, and issue the command
tar xvf /dev/rst5 filename
where filename is the name of the file you wish to restore. To list all the files on a tape, issue the command
tar tvf /dev/rst5
This might take as long as two hours. If the filenames end with a .gz suffix, they have been compressed, and must be uncompressed using the publicly available program gunzip.
Prior to December, 1993, the filenames on the tape are of the form resXXX, and are the same as the names of the 6250 bpi tapes on which the data were previously stored. The first file (from 1976) is res01, the December, 1993 file is res227, and the file numbers in between are consecutive. The only anomalies are the files for August, September, and October, 1987 (the first three months that originally required more than one 6250 bpi tape), which are named res98A and res98B, res99A and res99B, res100A and res100B. For a complete listing of filenames, see Appendix B. Beginning in January, 1994, filenames are of the form resYYXXX, so the first file in 1994 is res94001, and the last is res94046.
3.3 Sorting Order
The Berkeley Options Data Base is currently available only in the ``resorted'' format. It is called the resorted format because it is little more than a resorted version of the MDR. The processing program alters certain fields of the MDR records to make them easier to interpret, performs a few screens for bad or duplicate records, and changes the sorting order. While the MDR is sorted according to five-letter ticker symbol, BODB is sorted according to three-letter ticker symbol. This means that in the BODB, all the day's records on the same underlying stock are ordered chronologically, regardless of expiration or strike, unlike the MDR, where records are further sorted by option contract.
In summary, the BODB is sorted according to the scheme
DATE : UNDERLYING ASSET : TIME
Beginning in January, 1994, records occurring in the same second are further sorted according to record type and option contract. The new sorting scheme is:
DATE : UNDERLYING ASSET : TIME : RECORD TYPE : EXPIRATION : PUT/CALL : STRIKE
3.4 Record Layout
The resorted data are contained in a fixed-length file with the following (undelimited) fields:
| Description | Field Type | Length |
| Record Type | Numeric | 2 |
| Ticker Symbol | Alpha | 3 |
| Date | Numeric | 6 |
| Time | Numeric | 6 |
| Expiration Month | Numeric | 2 |
| Strike Price | Numeric | 6 |
| Bid Price or Trade Price | Numeric | 5 |
| Ask Price or Volume | Numeric | 5 |
| Underlying Asset Price | Numeric | 5 |
Here are a few sample records from the Berkeley Options Data Base:
43IBM930104084001 1-08500033500337505140
 2IBM930104084014 2 04500007500080005140
 1IBM930104084021 1-05500003880000305140
 2IBM930104084021 2-04500001250013105140
 2IBM930104084034 2 05000004000042505140
 1IBM930104084038 2 05000004000000405140
 2IBM930104084040 2-05000002880031305140
 1IBM930104084044 1 05500000630004105140
 2IBM930104084052 2 05500001750018105140
 1IBM930104084053 1 05500000630001005140
3.4.1 Record Type
The first field [1-2] determines whether the record is a trade or a quote, and also incorporates the information in the MDR prefix code. The MDR data type and prefix codes are translated into BODB record types according to the following table:
| MDR Code | BODB Record Type |
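| 04HALT | 0 |
| 01 | 1 |
| 02 | 2 |
| 01SPRD | 3 |
| 01STDL | 4 |
| 02HALT | 5 |
| 01LATE | 6 |
| 01OSEQ | 7 |
| 01OPEN | 8 |
| 01OPNL | 9 |
| 03LATE | 20 |
| 03OSEQ | 21 |
| 03OPEN | 22 |
| 03SPRD | 23 |
| 03STDL | 24 |
| 03CNCO | 25 |
| 03CNCL | 26 |
| 03CANC | 27 |
| 03CNOL | 28 |
| 03OPNL | 29 |
| 04AUTO | 40 |
| 01RAES | 41 |
| 04END | 42 |
| 02ROTA | 43 |
| 04ROTA | 44 |
| 04ENDR | 45 |
| 02AUTO | 46 |
| 04END | 47 |
| 04ENDF | 48 |
| 04FAST | 60 |
| 01FAST | 61 |
| 02FAST | 62 |
| 01CLOS | 63 |
| 02CLOS | 64 |
| 04CLOS | 65 |
| 03REOP | 66 |
| 02ZZZZ | 67 |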
3.4.2 Ticker Symbol
In the Berkeley Options Data Base, the underlying ticker symbol, which is located in field [3-5], is copied directly from the first three characters of the MDR ticker symbol. As mentioned above, this ticker usually corresponds to the stock exchange ticker symbol for NYSE and AMEX stocks, but not for NASDAQ stocks. More details on ticker symbols are given in the ``Research Guide'' section of this document, and a list of CBOE ticker symbols is contained in Appendix C.
3.4.3 Date and Time
The BODB date and time fields are copied exactly from the corresponding fields in the MDR. Thus, the date and time, recorded in characters [6-17], are given in the form YYMMDDHHMMSS.
3.4.4 Expiration Month
Instead of using the CBOE's alphabetic expiration codes, the Berkeley Options Data Base reports the expiration month in numeric form in characters [18-19]. The records in the above sample are all for options expiring in January ([space]1) or February ([space]2). Details on how to determine exact expiration dates are contained in the ``research guide'' section below.
3.4.5 Strike Price and Call/Put Indicator
The BODB contains the strike price, denominated in cents, in location [21-25]. Puts are indicated by a negative sign in location [20]. Thus, the first record above is for a (January) 85 put, and the last record is for a (January) 55 call.
3.4.6 Trade Prices, Contract Volume, Bid and Ask Prices
Because the MDR trade fields are always empty for quote records, and its quote fields are empty for trade records, the BODB is able to save space by recording both fields in the same location. For trade records, the price is recorded at [26-30] and contract volume is recorded at [31-35]. For quote records, the bid price is recorded at [26-30] and the ask price is recorded at [31-35].
Instead of recording prices in thirty-seconds of a dollar, the BODB converts the fractional portion of the MDR bid, ask, and transaction prices to pennies, rounding off according to the following rule:
| Thirty-Seconds | Cents | Thirty-Seconds | Cents | Thirty-Seconds | Cents |
| 001 | 03 | 012 | 38 | 023 | 72 |
| 002 | 06 | 013 | 41 | 024 | 75 |
| 003 | 09 | 014 | 44 | 025 | 78 |
| 004 | 13 | 015 | 47 | 026 | 81 |
| 005 | 16 | 016 | 50 | 027 | 84 |
| 006 | 19 | 017 | 53 | 028 | 88 |
| 007 | 22 | 018 | 56 | 029 | 91 |
| 008 | 25 | 019 | 59 | 030 | 94 |
| 009 | 28 | 020 | 63 | 031 | 97 |
| 010 | 31 | 021 | 66 | 000 | 00 |
| 011 | 34 | 022 | 69 | | |
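Every entry in this table is consistent with rounding n/32 of a dollar to the nearest cent, with exact half-cents (4, 12, 20, and 28 thirty-seconds) rounded up. That rule is inferred from the table, not stated by the processing program's authors, and the function name below is hypothetical. A minimal sketch in integer arithmetic:

```python
def thirty_seconds_to_cents(n):
    """Convert a count of thirty-seconds of a dollar (0-31) to whole cents.

    Rounds 100 * n / 32 to the nearest cent, with ties rounded up. Adding 16
    (half of 32) before the floor division implements that rounding without
    any floating-point arithmetic.
    """
    if not 0 <= n < 32:
        raise ValueError("expected a fraction between 0 and 31 thirty-seconds")
    return (100 * n + 16) // 32
```

For example, 4 thirty-seconds (one eighth, or 12.5 cents) becomes 13, and 28 thirty-seconds (seven eighths, or 87.5 cents) becomes 88, which is why a price of 3 7/8 appears in the BODB as 00388.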
The second record in the above sample is a quote record on a February 45 call, with bid price of $7 1/2 (00750) and ask price of $8 (00800). The third is a trade record on a January 55 put, where at a price of 3 7/8 (00388), three contracts (00003) were traded.
3.4.7 Underlying Asset Price
For most ticker symbols, the Berkeley Options Data Base contains an exact copy of the last four digits of the MDR's underlying price field, plus an additional zero as a place-holder. Thus, the price of IBM in the above records was $51 1/2 (05140). You may notice that some of the previous BODB documentation describes this field as being recorded in dollars and cents. This is no longer correct: beginning with the January, 1986 data, the field is recorded in dollars and eighths for equity options.
For index options such as the OEX and SPX series, the BODB contains an exact copy of the first five digits of the MDR's underlying price field, so that an OEX value of 455.31 would be recorded as 45531. Not all underlying values for index options are recorded correctly in release 2.01 of BODB (January 1986 through December 1993). In some cases, such as the SPZ overflow ticker, the version 2.01 processing program mistakenly treated the MDR record as an equity option. Consequently, the BODB contains only the last three digits of the underlying price. An SPZ value of 415.65, for example, was mistakenly recorded as 56500. This problem has been corrected in the new version 3.0 program, so data beginning in January, 1994 are fine.
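The BODB layout described in this section can be read with a short parser. The sketch below makes several assumptions that should be checked against your own files: the function and key names are hypothetical; a record is classified as a trade or a quote by whether its MDR code in the record type table begins with 01 or 02; blank record types (which the text says can occur in version 3.0) are returned as None; and the underlying price is left as a raw string because its interpretation depends on the security and the release. A separate helper decodes the dollars-and-eighths form used for equity options from January 1986 onward.

```python
from fractions import Fraction

# BODB record types whose MDR code begins with 01 (trades) or 02 (quotes).
TRADE_TYPES = {1, 3, 4, 6, 7, 8, 9, 41, 61, 63}
QUOTE_TYPES = {2, 5, 43, 46, 62, 64, 67}


def parse_bodb_record(line):
    """Slice one 40-character resorted-format record into named fields."""
    if len(line) != 40:
        raise ValueError("expected a 40-character record")
    type_field = line[0:2].strip()
    record_type = int(type_field) if type_field else None
    rec = {
        "record_type": record_type,             # [1-2]
        "ticker": line[2:5].strip(),            # [3-5]
        "date": line[5:11],                     # [6-11]  YYMMDD
        "time": line[11:17],                    # [12-17] HHMMSS
        "expiration_month": int(line[17:19]),   # [18-19]
        "is_put": line[19] == "-",              # [20]    minus sign marks a put
        "strike_cents": int(line[20:25]),       # [21-25]
        "underlying_raw": line[35:40],          # [36-40]
    }
    if record_type in TRADE_TYPES:
        rec["price_cents"] = int(line[25:30])   # [26-30]
        rec["volume"] = int(line[30:35])        # [31-35]
    elif record_type in QUOTE_TYPES:
        rec["bid_cents"] = int(line[25:30])     # [26-30]
        rec["ask_cents"] = int(line[30:35])     # [31-35]
    return rec


def equity_underlying(raw):
    """Decode dollars-and-eighths, e.g. '05140' -> 51 1/2 (last digit is padding)."""
    return int(raw[0:3]) + Fraction(int(raw[3]), 8)


# Third sample record, assembled field by field so the widths are explicit.
SAMPLE = " 1" + "IBM" + "930104" + "084021" + " 1" + "-05500" + "00388" + "00003" + "05140"
```

With this sketch, `parse_bodb_record(SAMPLE)` reports a trade of 3 contracts at 388 cents on a January 55 put, and `equity_underlying("05140")` gives 51 1/2.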
4 Historical Information
A brief account of the origin of the Berkeley Options Data Base is given in a paper by Rubinstein and Vijh (see the Bibliography, below). The Berkeley Options Data Base was created by Mark Rubinstein and others at the University of California, Berkeley, in cooperation with the Chicago Board Options Exchange, and with a grant from the National Science Foundation. Several other individuals have helped create, develop, or maintain BODB over the years, including Anand Vijh, Mihir Bhattacharaya, Mark Garman, Robert Geske, Rachid Laraqui, Frederic Sipiere, Richard Lindsay, Gail Belonsky, Rakesh Chandra, Stewart Mayhew, Kehong Wen, and Xiaoyan Ma. The Berkeley Options Data Base operates under the auspices of the Institute of Business and Economic Research at the University of California, Berkeley.
4.1 Versions
As changes occur in the MDR tapes, or as errors are discovered in the data, new versions of the BODB are sometimes released. BODB does not guarantee the accuracy of the data, and has no responsibility to replace old data when a new version is released. However, under our current policy we will replace data for anyone who has purchased our data, at any time, and for any reason. The replacement charge is $80 per 8mm tape. Unfortunately, we can no longer send replacement data on 9-track tapes.
Version numbers for the BODB correspond to the processing program used to convert MDR data to BODB format. It is costly and difficult for us to go back and process old MDR data with a new processing program. Consequently, different periods of the database were processed using different programs, and there are slight format differences. The current processing program, version 3.0, is written in PERL and runs under a SunOS implementation of unix system V release 4. Currently, the version 3.0 release begins in January, 1994. Unlike previous versions, version 3 does not alter record types to indicate an underlying stock split.
Moving to the unix platform involved a lot of new programming, and we are still debugging the processing program. We are aware of one error in version 3.0: there are a few extra unidentified control records in the MDR data, which should have been discarded but were included in the BODB with no record type or with type 4. These records contain no useful information and should be disregarded. If you find any other problems with the data, please contact the database manager.
The processing program for Version 2.1, which is the current release from January 1986 through December 1993, was written in FORTRAN and REXX, and operated under VM/CMS. This and previous versions altered the record type for options whose specifications had been modified because of an underlying stock split. The procedure for effecting this modification was imperfect, however, and not all splits were correctly specified. Rather than continuing to provide faulty split information, we elected to discontinue this practice. To obtain correct information on stock splits, you must contact the exchange and obtain the memo corresponding to the split. For more information, see the section on stock splits, below. Another problem with version 2.1 was its incorrect treatment of underlying prices for index options, described in the ``underlying asset'' section above. Plans are underway to reprocess this data under version 3.
Version 2.0, which covers December 1979 through December 1985, differs from version 2.1 in that (1) the unused portion of each numeric field is filled with blank spaces instead of zeros, and (2) underlying stock prices are recorded in dollars and cents instead of dollars and eighths. Starting September 30, 1985, the market began opening at 8:30 instead of 9:00. In version 2.0, time stamps for the first half hour of trading were incorrect: transactions occurring X seconds before 9:00 A.M. were mistakenly recorded as having occurred X seconds after 9:00 A.M. This problem only exists from September 30, 1985 through December 31, 1985.
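The arithmetic of this time-stamp error can be sketched in C as follows. This is only an illustration under stated assumptions: the function names are hypothetical, times are assumed to have been converted to seconds after midnight, and dates are taken as YYMMDD numbers. Note that a record stamped between 9:00 and 9:30 in the affected period may be either a reflected pre-9:00 transaction or a genuine post-9:00 one; the time stamp alone cannot tell them apart, so deciding which records to correct is left to the researcher.

```c
#include <assert.h>

/* 9:00:00 A.M. expressed as seconds after midnight. */
#define NINE_AM 32400L

/*
 * Undo the version 2.0 reflection: a transaction that really occurred
 * X seconds before 9:00 A.M. was recorded X seconds after 9:00 A.M.
 * Given a recorded time in the affected half hour, return the time at
 * which the transaction actually took place.
 * ASSUMPTION: times are in seconds after midnight; convert from the
 * file's own time field first if it is stored differently.
 */
long unreflect_time(long recorded_secs)
{
    long x = recorded_secs - NINE_AM;   /* X seconds after 9:00 */
    assert(x >= 0 && x <= 1800);        /* only the first half hour is affected */
    return NINE_AM - x;                 /* X seconds before 9:00 */
}

/* Nonzero if a date (YYMMDD as a number) lies in the affected period,
   September 30, 1985 through December 31, 1985. */
int date_is_affected(long yymmdd)
{
    return yymmdd >= 850930L && yymmdd <= 851231L;
}
```

For instance, a record stamped 9:15:00 (33300 seconds) would map back to 8:45:00 (31500 seconds).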
Version 1, which covers August 1976 through November 1979, has a slightly different format, with one fewer character in each of the strike price and underlying asset fields.
4.2 The Consolidated Format
Until 1987, BODB offered a condensed version of the MDR file, called the consolidated format. Instead of reporting each individual record, this format summarized all trades and quotes occurring within each block of time during which the underlying asset price did not change. Each record contained the date, ticker, strike price, and time to expiration, a beginning and ending time, the underlying stock price, high and low quoted prices, a summary of all trades during the period, information on the stock prices preceding and following the record and the approximate elapsed time between these price changes, and the number of original transaction records from which the consolidated record was created. The consolidated format was discontinued due to lack of interest.
All the information in the consolidated format may be derived from the resorted format.
4.3 The Supplemental Tape
BODB formerly offered a ``supplemental'' tape which included several daily interest rate series, dividend information on all CBOE underlying stocks, and daily closing levels of stock market indexes. This service was discontinued due to high maintenance costs. Dividend information is available from CRSP. Interest rates may be obtained from the Wall Street Journal, or else implied interest rates may be calculated from the prices of S&P 500 index options using the put-call parity relationship. Daily index information may be purchased from the CBOE.
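As a sketch of the put-call parity calculation, consider a European call and put with the same strike price and expiration date. The symbols below are introduced here for illustration and are not part of the database: C and P are the simultaneously observed call and put prices, S is the index level, K is the strike price, T is the time to expiration in years, D is the present value of dividends paid on the index before expiration, and r is the continuously compounded interest rate.

```latex
% European put-call parity, solved for the implied interest rate r
C - P = S - D - K e^{-rT}
\quad\Longrightarrow\quad
r = -\frac{1}{T}\,\ln\!\left(\frac{S - D - C + P}{K}\right)
```

The result is only as good as the synchronization of the four prices, which is one reason to build the inputs from time-stamped transactions data rather than from closing prices.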
5 Research Guide
5.1 Why Use Transactions Data?
Transactions data is ideal for empirically testing market-microstructure models, and extremely useful for any type of research investigating bid-ask spreads, order flow, trading volume, price discovery and the lead-lag relationship between options and their underlying stocks, price discreteness, or intraday dynamics. It is also very useful in more traditional asset pricing, in that it can be used to measure the biases introduced by microstructural frictions.
In options markets, it is particularly important to recognize the severe problems associated with asynchronous data. Since option prices are so sensitive to the underlying asset price, unobserved intraday movements in the underlying price will render asynchronously recorded closing prices incomparable. In fact, it is not uncommon to observe apparent static arbitrage opportunities among reported closing prices. Results of studies based on closing option prices without correcting for changes in the contemporaneous stock price are often viewed as highly suspicious, and are unlikely to be accepted for publication.
Thus, if your research calls for daily observations of option prices, it is strongly recommended that you carefully construct your daily observations from transactions data.
5.2 Ticker Symbols
Appendix C of this document contains information to help match options to their underlying security. However, the list is constantly changing, and you may wish to update it yourself. This section describes how to obtain information on ticker symbols, and describes conventions used by the exchanges in choosing ticker symbols.
5.2.1 Obtaining Ticker Symbol Information
The best way to identify option ticker symbols is to contact the Options Clearing Corporation and obtain their pamphlet ``Directory of Exchange Listed Options.'' This pamphlet contains the ticker symbols for options traded on the CBOE, as well as those trading on the New York, American, Philadelphia, and Pacific stock exchanges. The pamphlet may also be obtained through the CBOE, and is also available, for a fee, in electronic form.
One drawback is that this pamphlet contains only currently traded options. For historical research, there is another list, available from the CBOE, that contains listing and delisting dates of equity options, along with some name change information that can be helpful for identifying tickers. Appendix C contains a subset of this list, containing only ticker symbols, listing and delisting dates, and company names for options trading on the CBOE. Appendix C also contains lists of ticker symbols for other options trading on the CBOE, including index options, index LEAPS, interest rate options, and equity LEAPS.
5.2.2 Ticker Symbol Conventions
The New York and American Stock Exchanges use three-letter ticker symbols to identify securities. Options on stocks traded on these exchanges generally use the same three-letter symbol. NASDAQ ticker symbols are longer than three letters, and for these stocks, options are traded using a ticker symbol consisting of two letters from the NASDAQ ticker plus the letter Q.
Index options are normally given a three-letter ticker symbol ending in X, with the first two letters chosen to describe the underlying security. For instance, the ticker symbol for the S&P 500 index is SPX, and the ticker symbol for the S&P 100 index, originally known as the CBOE 100 index, is OEX.
Because options are traded using a single letter to designate the option's strike price, there is a limit to the number of different strike prices that may be assigned to a single ticker symbol. For the most popular option contracts, such as the SPX and OEX indexes, it became necessary to assign a second ticker symbol to accommodate the wide range of strike prices. Secondary ticker symbols are usually created by changing the last letter of the ticker symbol to Z. Ticker SPZ serves as the overflow symbol for SPX, and OEZ is the overflow for OEX. Sometimes, additional option classes are added that are similar to an existing class, and these are assigned ticker symbols that are as descriptive as possible. For example, options on the SPX index that expire at the end of each quarter rather than the customary dates were given the ticker symbol SPQ.
Options with long-term expiration dates, known as LEAPS, are also assigned their own ticker symbols, which resemble the original tickers. LEAPS are traded with one expiration date per year, either in December or January depending on whether they are Equity or Index LEAPS. Generally a letter is chosen to designate all LEAPS expiring in the same year, such as ``V'' for 1995 or ``L'' for 1996, and each LEAPS ticker symbol is created, if possible, by substituting this letter for one of the letters in the original ticker. For example, the IBM 1995 LEAPS are traded under the ticker symbol VIB, and IBM 1996 LEAPS under the symbol LIB. Sometimes the letters have to be rearranged to avoid duplicating another ticker symbol. Ticker symbols for some Index and Equity LEAPS are listed in Appendix C. You may wish to contact the exchange for a more current list.
5.3 Stock Splits
Another reason for introducing new ticker symbols is to differentiate between options on pre- and post-split shares. Because exchange-traded options are protected against splits, the terms of the existing option contracts must be adjusted whenever a stock splits. A call option contract with a strike price of 80 gives its owner the right to buy 100 shares of the underlying stock for 80 dollars a share. If the stock then splits 2-for-1, the contract is adjusted so that the owner now holds a call on 200 shares with a strike price of 40. After the split, however, new options will also be written with a strike price of 40, but the newly-written options are on 100 shares. Under the standard ticker-symbol nomenclature, traders would be unable to distinguish between options written on 200 shares and those written on 100 shares. To solve this problem, the exchange introduces a secondary ticker symbol whenever there is a stock split. The split-adjusted options are assigned a new ticker symbol, usually constructed from two letters of the original ticker symbol plus the letter Z. If IBM were to split, the old options might begin trading under the ticker symbol IBZ, and newly-written options would trade under the usual ticker symbol. In addition, traders must remember that the strike prices on the old options will be spaced at unconventional intervals. After a 2-for-1 split, the old options will trade at two-and-a-half dollar increments. To avoid confusion, the exchange prepares a special memorandum whenever a stock splits, specifying exactly the 5-letter trading code for each old option and each new option.
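The contract adjustment in the example above can be written out as a short C sketch. The structure and function names are hypothetical, and the generalization from the 2-for-1 case in the text to an arbitrary n-for-m split is an assumption; the exchange memorandum for the split in question remains the authoritative source for the adjusted terms.

```c
#include <assert.h>

/* Terms of one option contract (illustrative structure). */
struct contract {
    double strike;   /* strike price in dollars per share */
    long   shares;   /* number of shares deliverable per contract */
};

/*
 * Adjust a contract for an n-for-m stock split (2-for-1: n=2, m=1).
 * The strike is scaled down and the share count scaled up, so that
 * the total exercise cost, strike * shares, is unchanged.
 * ASSUMPTION: the text describes only the 2-for-1 case.
 */
struct contract adjust_for_split(struct contract old, long n, long m)
{
    struct contract adj;
    assert(n > 0 && m > 0);
    adj.strike = old.strike * (double)m / (double)n;
    adj.shares = old.shares * n / m;
    return adj;
}
```

Applied to a contract on 100 shares with a strike of 80 and a 2-for-1 split, this yields a contract on 200 shares with a strike of 40, matching the example in the text.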
5.4 Option Expiration
Nearly all options expire on the Saturday following the third Friday of the expiration month. One exception is the S&P 500 end-of-quarter options (SPQ). Equity options are listed for the nearest two expiration dates, plus the next two expiration dates in the option's expiration ``cycle.''
| Cycle | Expiration Months |
| January | JAN, APR, JUL, OCT |
| February | FEB, MAY, AUG, NOV |
| March | MAR, JUN, SEP, DEC |
For example, at the beginning of April, a January-cycle stock will have options expiring in April, May, July, and October, a February-cycle stock will have options expiring in April, May, August, and November, and a March-cycle stock will have options expiring in April, May, June, and September. On the Saturday following the third Friday of April, the April options expire, and a new set of contracts is introduced (June contracts are opened for January-cycle and February-cycle stocks, and December contracts are opened for March-cycle stocks).
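The listing rule and the standard expiration date described above can be sketched in C as follows. The function and type names are hypothetical. The day-of-week routine uses a well-known table-based method for the Gregorian calendar and is not something the exchange prescribes, and the code covers only the standard rule, not exceptions such as SPQ.

```c
#include <assert.h>

enum cycle { JAN_CYCLE = 0, FEB_CYCLE = 1, MAR_CYCLE = 2 };

/*
 * Fill out[0..3] with the four listed expiration months (1-12), given
 * the nearest unexpired month `front`: the two nearest months, then
 * the next two months that belong to the stock's cycle.
 */
void listed_months(int front, enum cycle c, int out[4])
{
    int n = 2, m;
    assert(front >= 1 && front <= 12);
    out[0] = front;
    out[1] = front % 12 + 1;
    /* Walk forward from the month after out[1], keeping cycle months. */
    for (m = out[1] % 12 + 1; n < 4; m = m % 12 + 1)
        if ((m - 1) % 3 == (int)c)
            out[n++] = m;
}

/* Day of the week for a Gregorian date; 0 = Sunday, 6 = Saturday. */
int day_of_week(int y, int m, int d)
{
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};
    if (m < 3)
        y -= 1;
    return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

/* Day of the month of the Saturday following the third Friday. */
int expiration_day(int year, int month)
{
    int first_friday = 1 + (5 - day_of_week(year, month, 1) + 7) % 7;
    return first_friday + 14 + 1;
}
```

With front month 4 (April), the three cycles give months 4, 5, 7, 10; 4, 5, 8, 11; and 4, 5, 6, 9, as in the example. Once April expires, calling the function with front month 5 gives the new listings.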
Some index options, such as SPX, are assigned to expiration cycles like equity options. Others, such as the OEX, are always traded for the closest four expiration dates. In addition, long-term options with maturities of up to three years are traded on popular securities. A list of index and equity LEAPS is contained in Appendix C. These options expire once a year, in December or January.
5.5 Other Data Sources
Organized exchanges are usually required to keep a permanent record of all their transactions. Many of these exchanges have made their transactions data available to the public, or at least to academics for research purposes.
In general, the only way to acquire this data is directly from the exchange, usually on 9-track tapes.
5.5.1 Stock Market Transactions Data
Prior to January, 1993, transactions data for stocks traded in the United States were available on 9-track magnetic tapes from the Institute for the Study of Securities Markets (ISSM). Beginning in January 1993, the New York Stock Exchange instituted the ``Trades And Quotes'' (TAQ) database, containing trades and quotes for all stocks on the New York and American Stock Exchanges, as well as the Regional exchanges and NASDAQ. The TAQ database comes on PC-readable CD-ROM, with built-in access programs that run under DOS. The FORTRAN source code is included for these access programs. The data may be purchased from the New York Stock Exchange for $200/month, with each month stored on a separate CD.
In addition to the TAQ, the New York Stock Exchange has more extensive databases which are not currently available to the public, but which are often used by researchers at the exchange. A small sample of this data, known as TORQ for ``Trades, Orders, Reports, and Quotes,'' was publicly released on a single CD-ROM. This CD contains a few months of data for a small number of firms. In addition to the trade and quote data available on TAQ, this database contains the so-called ``audit trail'' data, a rich source of information about the institutional structure of order flow in the stock market.
5.5.2 Options on Futures
Options on futures are traded primarily at the Chicago Mercantile Exchange and the Chicago Board of Trade. Transactions data may be obtained from these exchanges on 9-track tapes. Especially of interest are their options on S&P index futures, bond futures, Eurodollar futures, and currency futures.
5.5.3 Currency Options
The Berkeley Options Data Base contains some data on currency options, which were traded on the CBOE during part of 1985 and 1986. European-style options were traded on the Japanese Yen, Deutsche Mark, British Pound, Swiss Franc, and Canadian Dollar. Due to the low trading volume on these contracts, the CBOE sold them to the Philadelphia Stock Exchange (PHLX), where they are still traded.
American-style currency options are also traded at the PHLX. For a charge of $75, academic researchers may purchase either a daily summary database on 30 floppy disks or a transactions database on one 9-track magnetic tape. The transactions database begins in 1984, and contains only trades, not quotes.
The PHLX stores their quote data on microfiche, and it is inaccessible for technical reasons.
5.5.4 How to Contact the Exchanges
American Stock Exchange
Derivative Securities
86 Trinity Place
New York, NY 10006
1-800-THE-AMEX
Chicago Board Options Exchange
LaSalle at Van Buren
Chicago, IL 60605
1-800-OPTIONS
(312) 786-5600
New York Stock Exchange
Options and Index Products
11 Wall Street
New York, NY 10005
1-800-692-6973
(212) 656-8533
The Options Clearing Corporation
440 South LaSalle Street
Suite 2400
Chicago, IL 60605
1-800-537-4258
(312) 322-6200
Pacific Stock Exchange
Options Marketing
115 Sansome Street, 7th Floor
San Francisco, CA 94104
1-800-TALK-PSE
(415) 393-4028
Philadelphia Stock Exchange
1900 Market Street
Philadelphia, PA 19103
1-800-THE-PHLX
(215) 496-5404
Chicago Board of Trade
Market Data Services [Larsenia R. Williams]
141 W. Jackson, Ste. 2313
Chicago, IL 60604-2994
(312) 341-3163
Chicago Mercantile Exchange
Records Management [André Gibson]
30 South Wacker Drive
Chicago, IL 60606-7499
(312) 930-3178
6 Computer Programs
6.1 Extracting a List of Ticker Symbols
If you are using a unix system, it is very easy to extract a list of ticker symbols contained in a BODB file. To create a new file ticklist from BODB file res227, containing an alphabetical list of all ticker symbols in that file, simply type the following at the unix command line:
cut -c 3-5 res227 | sort -u -o ticklist &
This command ``cuts'' out columns 3 through 5 from the file res227, and sends the result to a sorting program. The -u option instructs the sorting program to throw away duplicate observations, and -o ticklist specifies the name of the output file. The ampersand (&) makes the command run in the background. You could use the same type of command to extract a list of record types [1-2], expiration dates [18-19], strike prices [20-25], or any other field in the data.
6.2 Extracting Columns
Suppose you are not interested in all the data fields, but only wish to extract a few fields. You can accomplish this from the unix command line using the cut and paste utilities. For example, suppose that you only want the ticker symbol, date, time, and stock price. Because the ticker, date, and time are adjacent, you can cut them out and place them in a separate file with a single command,
cut -c 3-17 res227 > firsthalf
then cut out the stock prices with a similar command,
cut -c 35-40 res227 > secondhalf
and glue them back together in a single file:
paste firsthalf secondhalf > outfile
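If you would rather pull out columns inside a C program, the same cut-and-paste operation can be sketched as below. The function names are hypothetical. Column numbers are 1-based and inclusive, as with cut -c, and the two pieces are joined with a tab because that is the default separator used by paste.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy columns first..last (1-based, inclusive, as in "cut -c") of a
 * record into out, which must have room for last-first+2 bytes.
 * Columns beyond the end of the record are skipped.
 * Returns the number of characters copied.
 */
size_t bodb_field(const char *record, int first, int last, char *out)
{
    size_t len = strlen(record);
    size_t n;
    assert(first >= 1 && first <= last);
    if ((size_t)first > len) {
        out[0] = '\0';
        return 0;
    }
    n = (size_t)(last - first + 1);
    if ((size_t)last > len)
        n = len - (size_t)first + 1;
    memcpy(out, record + first - 1, n);
    out[n] = '\0';
    return n;
}

/*
 * Build "ticker+date+time <tab> stock price" from one record, i.e.
 * columns 3-17 and 35-40, mirroring the cut and paste commands above.
 * out must have room for at least 23 bytes.
 */
void ticker_time_price(const char *record, char *out)
{
    size_t n = bodb_field(record, 3, 17, out);
    out[n++] = '\t';
    bodb_field(record, 35, 40, out + n);
}
```

A function like this could be called on each line read with fgets, writing the result with puts, so that the intermediate files firsthalf and secondhalf are not needed.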
6.3 Extracting Records
The simplest way to extract records from a BODB file is using a computer program that reads the file one line at a time, checks to see whether the line meets the selection criteria, and if so, writes the line to an output file. Simple jobs, such as extracting all records for a single ticker symbol, can easily be handled using built-in unix utilities such as grep. For example, to extract all OEX records from BODB file res193 and place them in a new file called oex193, simply type
grep OEX res193 > oex193
at the unix command line. When extracting one- or two-letter ticker symbols using grep, you may accidentally extract unwanted symbols. For example, if you were to try grep T res193, you would not only extract ticker symbol T but also every other ticker symbol containing the letter T. To get around this problem, you must tell the grep program to look for the string T preceded by a number from 0 to 9 and followed by a space. How you specify this may depend on the syntax for regular expressions in your unix shell. In the sh shell, quote the pattern so that the space is preserved, and issue the command
grep '[0-9]T ' res193 > t193
The grep command is limited to string comparison, and is adequate only for the simplest extraction problems. For more complicated extraction criteria, you may wish to consider using a unix programming language such as sed, awk, or perl, an SQL or other database package, or a compiled language such as FORTRAN or C.
Following is a template extraction program in C. The program takes, as input from the command line, an input file name, a starting record number, and an ending record number. It reads the designated records from the input file, and outputs them to standard output. You may insert whatever selection criteria you wish, setting the variable keepdummy=1 if you wish to output the record. To read the entire input file, you may either modify the program or run the program with beginning record number 1 and an ending record number larger than the number of records in the file.
/* Extractor
C Program to Extract Records from the Berkeley Options Data Base
Copyright (C) Stewart Mayhew, March 1995. */
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
/* Main Function Begins Here */
int main (int argc, char **argv)
{
    char buf[256];
    FILE *fp;
    char record[60];
    long count;
    long offsetnum, numrecs, beginp, endp;
    int keepdummy;
    /* Check for proper input */
    if (argc != 4) {
        printf("usage: %s <input file> <first record> <last record>\n",
               argv[0]);
        exit(1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        printf("Can't Find file %s\n",argv[1]);
        exit(1);
    }
    beginp=atol(argv[2]);
    endp=atol(argv[3]);
    /* Each record occupies 41 bytes: 40 characters plus the newline */
    offsetnum=(beginp-1)*41;
    fseek(fp,offsetnum,SEEK_SET);
    numrecs=endp-beginp+1;
    /* Loop Begins Here */
    for (count=1;count<=numrecs;count++) {
        if (fgets(buf,256,fp)==NULL)
            break; /* end of file reached */
        keepdummy=0;
        /* Read Record (41 characters, including the newline) */
        strncpy(record,buf,41);
        record[41]='\0';
        /* ..........Insert Search Criteria Here...........
        Set keepdummy=1 to output the record */
        /* Write Record to Standard Output */
        if (keepdummy) printf("%s",record);
    } /* End of Loop through records */
    fclose (fp);
    return(0);
}
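As one example of a selection criterion that could be placed at the marked spot in the template above, the following sketch tests whether the ticker field of a record matches a given symbol exactly. It avoids the partial-match problem noted for grep. The function name is hypothetical, and the code assumes that symbols shorter than three letters are left-justified in columns 3 through 5 and padded with spaces.

```c
#include <assert.h>
#include <string.h>

/*
 * Return nonzero if the ticker field (columns 3-5) of record equals
 * ticker exactly.
 * ASSUMPTION: short symbols are left-justified and space-padded,
 * so "T" is stored as "T  ".
 */
int ticker_matches(const char *record, const char *ticker)
{
    char want[4] = "   ";            /* three spaces, then the terminator */
    size_t n = strlen(ticker);
    assert(n >= 1 && n <= 3);
    if (strlen(record) < 5)
        return 0;                    /* too short to hold a ticker field */
    memcpy(want, ticker, n);         /* overwrite the padding with the symbol */
    return memcmp(record + 2, want, 3) == 0;
}
```

Inside the template, the criterion would then read keepdummy = ticker_matches(record, "T"); and several such tests can be combined with the usual logical operators.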
6.4 Creating an Index File
Both grep and the Extractor template program will tend to be slow, because they have to read each record sequentially. A single BODB file may contain up to 4 million records, and since you will probably be running the program multiple times on different files, it may take quite a long time to extract all the data you need. In the long run, you can save a lot of time by creating an index that stores information about where the different ticker symbols are located within the file. You will have to read the data sequentially to create the index file, but once created it will dramatically decrease extraction time.
Following is a program that reads a BODB file and writes an index to standard output. The index is simply a list of beginning and ending record numbers for each date/ticker symbol combination.
/* MakeIndex
C Program to create an index for a BODB file
Copyright (C) Stewart Mayhew, March 1995 */
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
/* Main Function Begins Here */
int main (int argc, char **argv)
{
    char buf[256];
    FILE *fp;
    char rtype[10];
    char ticker[10];
    char thisticker[10] = ""; /* no ticker seen yet */
    char datebuf[10];
    long date,thisdate=0,count,ibegin=1,iend;
    /* Check for proper input */
    if (argc != 2) {
        printf("usage: %s <input file> \n", argv[0]);
        exit(1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        printf("Can't Find file %s\n",argv[1]);
        exit(1);
    }
    /* Loop Begins Here */
    for (count=1;;count++) {
        if (fgets(buf,256,fp)==NULL)
            break;
        /* Read Record: type [1-2], ticker [3-5], date [6-11] */
        if (sscanf(buf, "%2c%3c%6c", rtype, ticker, datebuf) != 3)
            continue; /* skip a line too short to parse */
        rtype[2]='\0';
        ticker[3]='\0';
        datebuf[6]='\0';
        date=atol(datebuf);
        /* Check to see if there is a new ticker or date. If so, output index info. */
        if ((strcmp(ticker,thisticker)!=0) || (date != thisdate)) {
            iend=count-1;
            if (iend>0)
                printf("%ld %s %ld %ld\n",thisdate,thisticker,ibegin,iend);
            ibegin=count;
            strcpy(thisticker,ticker);
            thisdate=date;
        }
    } /* End of Loop through records */
    /* Output the final ticker/date block */
    iend=count-1;
    if (iend>0)
        printf("%ld %s %ld %ld\n",thisdate,thisticker,ibegin,iend);
    fclose (fp);
    return(0);
}
6.5 Using the Index File to Extract Data
This section contains a unix shell program and a modified version of the template Extractor program that, together with an index file, may be used to extract data from the Berkeley Options Data Base. The shellscript reads in a list of ticker symbols from the file ``bodbread.in'' and a list of BODB filenames from the file ``bodbread.files.'' For each BODB file resXXX, it assumes there exists an index file ``index.resXXX,'' created by the program MakeIndex above. For each file named in bodbread.files, the shellscript creates a temporary extraction file called ``templist'' by grepping the appropriate lines out of the index file. Then, it calls the C program ``IndexExtractor,'' which extracts the specified records from the BODB file.
To extract data, modify the IndexExtractor program as you wish, compile the code using an ANSI-C compiler, create the input files bodbread.in and bodbread.files, then run the shell script:
#!/bin/sh
XX=`cat bodbread.in`
YY=`cat bodbread.files`
for Y in $YY
do
: > templist
for X in $XX
do
egrep "[ ]$X[ ]" index.$Y | cut -c 12-26 >>templist
done
IndexExtractor $Y templist
rm templist
done > bodbread.out
Here is the code for the Extraction Program:
/* IndexExtractor
C Program to Extract Data from an indexed BODB file
Copyright (C) Stewart Mayhew, March 1995. */
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
/* Main Function Begins Here */
int main (int argc, char **argv)
{
    char buf[256];
    char buftix[256];
    FILE *fp;
    FILE *fp2;
    char record[60];
    long count;
    long offsetnum, numrecs, beginp, endp;
    /* Check for proper input */
    if (argc != 3) {
        printf("usage: %s <datafile> <extractionfile>\n", argv[0]);
        exit(1);
    }
    if ((fp = fopen(argv[1], "r")) == NULL) {
        printf("Can't find data file %s\n",argv[1]);
        exit(1);
    }
    if ((fp2 = fopen(argv[2], "r")) == NULL) {
        printf("Can't find extraction file %s\n",argv[2]);
        exit(1);
    }
    /* Outer Loop: one pass per line of the extraction file */
    while (fgets(buftix,256,fp2)!=NULL) {
        if (sscanf(buftix, "%ld %ld", &beginp, &endp) != 2)
            continue; /* skip a line without two record numbers */
        /* Each record occupies 41 bytes: 40 characters plus the newline */
        offsetnum=(beginp-1)*41;
        fseek(fp,offsetnum,SEEK_SET);
        numrecs=endp-beginp+1;
        /* Inner Loop Begins Here */
        for (count=1;count<=numrecs;count++) {
            if (fgets(buf,256,fp)==NULL)
                break; /* end of data file reached */
            /* Read Record (41 characters, including the newline) */
            strncpy(record,buf,41);
            record[41]='\0';
            printf("%s",record);
        } /* End of Loop through records */
    }
    fclose (fp);
    fclose (fp2);
    return(0);
}
6.6 Economizing Storage Space
You can decrease the amount of space required to store BODB files by a factor of about 8:1 using a simple compression program. The unix compress program will work fine for this purpose, but we recommend using gzip, which is nearly as universal as compress but uses a more efficient compression algorithm.
In addition, you can reduce storage space by reducing the amount of redundant information in the database. For example, each record contains the date and ticker symbol. If you use the indexing program suggested above, dates and ticker symbols can be recovered from the index file, so you can remove nine characters [3-11] from each record, reducing storage size by 1/4.
If you resort the data by option series, you can modify the index program to create one entry for each series. This will greatly increase the size of the index file, but will save you another eight characters per record. In the vast majority of cases, the last two digits in the strike price field are ``00''. Exceptions are for options on low-priced stocks with strikes separated by $2.50, and options on stocks which have recently split. If you are storing a subset of data that only contains even-dollar-incremented strike prices, you can remove these two characters. If you are storing only equity options, you can remove character [40], which is always zero. If you try hard enough you should be able to reduce a 150 megabyte BODB file to about 10 megabytes. Please be sure to carefully document all formatting changes you make, and be sure never to disturb the data on the original tapes.
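The first of these reductions, dropping the ticker and date in columns 3 through 11, can be sketched in C as a pair of functions. The names are hypothetical, and the sketch assumes the ticker and date for each block of records are recovered from the index file as described above. The second function reverses the operation, so nothing is lost.

```c
#include <assert.h>
#include <string.h>

/*
 * Drop columns 3-11 (ticker and date) from a record.
 * out must have room for strlen(record)-8 bytes.
 */
void compact_record(const char *record, char *out)
{
    size_t len = strlen(record);
    assert(len >= 11);
    memcpy(out, record, 2);                  /* columns 1-2: record type */
    memcpy(out + 2, record + 11, len - 11);  /* columns 12 onward */
    out[len - 9] = '\0';
}

/*
 * Rebuild the full record from a compacted one, given the 3-character
 * ticker and 6-character date taken from the index file.
 * out must have room for strlen(compact)+10 bytes.
 */
void expand_record(const char *compact, const char *ticker,
                   const char *date, char *out)
{
    size_t len = strlen(compact);
    assert(len >= 2 && strlen(ticker) == 3 && strlen(date) == 6);
    memcpy(out, compact, 2);                 /* record type */
    memcpy(out + 2, ticker, 3);              /* columns 3-5 */
    memcpy(out + 5, date, 6);                /* columns 6-11 */
    memcpy(out + 11, compact + 2, len - 2);  /* remaining fields */
    out[len + 9] = '\0';
}
```

Because the compacted records are shorter, the fixed record length of 41 bytes used by the Extractor programs would have to be changed to match, and the change should be documented along with the data.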
Appendix
A.1 Bibliography of Papers Using Options Transaction Data
A large number of published articles and working papers have used BODB or MDR data to study securities pricing, market microstructure, and other similar topics. Here is a partial list of these many papers. [This section of the user's guide is still under construction.]
Aggarwal, Raj and Edward Gruca, ``Intraday Trading Patterns in the Equity Options Markets,'' Journal of Financial Research v14 n4 (Winter 1993): 285-297.
(Examines intraday patterns in volume, proportion of small trades, proportion of transactions on upticks, quoted price levels, and bid-ask spreads in the options market.)
[Data: BODB (Jul-Dec, 1986)]
Ancel, Esther Weinstock and Ramash K. S. Rao, ``Stock Returns and Option Prices: An Exploratory Study,'' Journal of Financial Research v13 n3 (Fall 1990): 173-185.
(Uses Options Data to back out implied parameters of an option pricing model)
[Data: BODB (Feb-Jul, 1979)]
Bhattacharya, Mihir, ``Empirical Properties of the Black-Scholes option-pricing Formula Under Ideal Conditions,'' Journal of Financial and Quantitative Analysis v15 n5 (Dec 1980): 1081-1105.
(Uses stock returns data on CBOE traded options to test whether the discretely-rebalanced Black-Scholes hedging strategy truly replicates the option)
Bhattacharya, Mihir, ``Transactions Data Tests of Efficiency of the Chicago Board Options Exchange,'' Journal of Financial Economics v12 (1983): 161-185.
(Uses transactions data to test the arbitrage boundary conditions imposed by rational option pricing and the uniformity of implied volatilities across options imposed by the Black-Scholes model)
[Data: BODB (Aug 1976-Jun 1977)]
Chan, Kalok, Y. Peter Chung and Herb Johnson, ``Why Option Prices Lag Stock Prices: A Trading-based Explanation,'' Journal of Finance v48 n5 (Dec 1993): 1957-1967.
(Finds that stock prices lead option prices and attributes this to the larger relative tick size in option markets)
[Data: BODB (Jan-Mar, 1986)]
Diz, Fernando, ``Long and Short-Run Dynamics of Volatility Formation in the S&P 100 Index Options Market: An Empirical Examination,'' Working Paper (1993).
[Data: MDR OEX (1985-1988)]
Diz, Fernando and Thomas J. Finucane, ``Index Options Expirations and Market Volatility,'' Working Paper (1994).
(Examines index option volatility near expiration dates)
[Data: MDR OEX (1985-1988)]
Diz, Fernando and Thomas J. Finucane, ``The Rationality of Early Exercise Decisions: Evidence from the S&P Index Options Market,'' Review of Financial Studies v6 n4 (Winter 1993): 765-797.
[Data: MDR OEX (Apr 1983-Dec 1988)]
Diz, Fernando and Thomas J. Finucane, ``The Time Series Properties of Implied Volatility of S&P Index Options,'' Journal of Financial Engineering (June 1993).
[Data: MDR OEX (Jan 1984-Aug 1987)]
Frankfurter, George M. and Wai K. Leung, ``Further Analysis of the Put-Call Parity Implied Risk-Free Interest Rate,'' Journal of Financial Research v14 n3 (Fall 1991): 217-232.
(Uses option prices to back out the interest rate implied by the put-call parity relationship, and compares the implied rates with T-bill rates)
[Data: BODB (5 stocks, 1982-1983)]
Kamara, Avraham and Thomas W. Miller, Jr., ``Daily and Intradaily Tests of European Put-Call Parity,'' Journal of Financial and Quantitative Analysis v30 n4 (Dec 1995): 519-539.
(Tests the put-call parity relationship for SPX options)
[Data: BODB (SPX Jan-Mar 1989)]
Mayhew, Stewart, Atulya Sarin and Kuldeep Shastri, ``The allocation of informed trading across related markets: An analysis of the impact of changes in equity-option margin requirements,'' Journal of Finance v50, n5 (Dec 1995):1635-1653.
(Examines the effect of option margin requirements on the underlying stock market.)
Peterson, David R., ``A Transaction Data Study of Day-of-the-Week and Intraday Patterns in Options Returns,'' Journal of Financial Research v13 n2 (Summer 1990): 117-131.
(Examines intertemporal patterns in options returns)
[Data: BODB (consolidated, 1983-1985)]
Rubinstein, Mark and Anand M. Vijh, ``The Berkeley Options Data Base: A Tool for Empirical Research,'' Advances in Futures and Options Research v2 (1987): 209-221.
(Describes the Berkeley Options Data Base)
Sheikh, Aamir M. and Ehud I. Ronn, ``A Characterization of the Daily and Intra-day Behavior of Returns on Options,'' Journal of Finance v49 n2 (Jun 1994): 557-579.
(Examines intertemporal patterns in options returns, correcting for changes in underlying stock prices)
[Data: BODB (Jan 1986-Sep 1987)]
A.2 List of BODB Files
| Filename | Dates | Filename | Dates |
| res01 | 760823-761119 | res53 | Aug 1983 |
| res02 | 761122-770218 | res54 | Sep 1983 |
| res03 | 770222-770520 | res55 | Oct 1983 |
| res04 | 770523-770819 | res56 | Nov 1983 |
| res05 | 770822-771021 | res57 | Dec 1983 |
| res06 | 771024-771230 | res58 | Jan 1984 |
| res07 | Jan-Feb 1978 | res59 | Feb 1984 |
| res08 | Mar-Apr 1978 | res60 | Mar 1984 |
| res09 | May-Jun 1978 | res61 | Apr 1984 |
| res10 | Jul-Aug 1978 | res62 | May 1984 |
| res11 | Sep-Dec 1978 | res63 | Jun 1984 |
| res12 | Jan-Mar 1979 | res64 | Jul 1984 |
| res13 | Apr-Jul 1979 | res65 | Aug 1984 |
| res14 | Aug-Sep 1979 | res66 | Sep 1984 |
| res15 | Oct-Nov 1979 | res67 | Oct 1984 |
| res16 | Dec 1979 | res68 | Nov-Dec 1984 |
| res17 | Jan-Feb 1980 | res69 | Jan 1985 |
| res18 | Mar-Apr 1980 | res70 | Feb 1985 |
| res19 | May-Jun 1980 | res71 | Mar-Apr 1985 |
| res20 | Jul-Aug 1980 | res72 | May-Jun 1985 |
| res21 | Sep 1980 | res73 | Jul 1985 |
| res22 | Oct 1980 | res74 | Aug 1985 |
| res23 | Nov-Dec 1980 | res75 | Sep 1985 |
| res24 | Jan-Feb 1981 | res76 | Oct 1985 |
| res25 | Mar 1981 | res77 | Nov 1985 |
| res26 | Apr 1981 | res78 | Dec 1985 |
| res27 | May 1981 | res79 | Jan 1986 |
| res28 | Jun 1981 | res80 | Feb 1986 |
| res29 | Jul 1981 | res81 | Mar 1986 |
| res30 | Aug 1981 | res82 | Apr 1986 |
| res31 | Oct 1981 | res83 | May 1986 |
| res32 | Nov 1981 | res84 | Jun 1986 |
| res33 | Dec 1981 | res85 | Jul 1986 |
| res34 | Jan 1982 | res86 | Aug 1986 |
| res35 | Feb 1982 | res87 | Sep 1986 |
| res36 | Mar 1982 | res88 | Oct 1986 |
| res37 | Apr 1982 | res89 | Nov 1986 |
| res38 | May 1982 | res90 | Dec 1986 |
| res39 | Jun 1982 | res91 | Jan 1987 |
| res40 | Jul 1982 | res92 | Feb 1987 |
| res41 | Aug 1982 | res93 | Mar 1987 |
| res42 | Sep 1982 | res94 | Apr 1987 |
| res43 | Oct 1982 | res95 | May 1987 |
| res44 | Nov 1982 | res96 | Jun 1987 |
| res45 | Dec 1982 | res97 | Jul 1987 |
| res46 | Jan 1983 | res98A | Aug I 1987 |
| res47 | Feb 1983 | res98B | Aug II 1987 |
| res48 | Mar 1983 | res99A | Sep I 1987 |
| res49 | Apr 1983 | res99B | Sep II 1987 |
| res50 | May 1983 | res100A | Oct I 1987 |
| res51 | Jun 1983 | res100B | Oct II 1987 |
| res52 | Jul 1983 | | |
| Filename | Dates | Filename | Dates | Filename | Dates |
| res101 | Nov 1987 | res143 | Dec 1990 | res185 | Sep II 1992 |
| res102 | Dec 1987 | res144 | Jan I 1991 | res186 | Oct I 1992 |
| res103 | Jan 1988 | res145 | Jan II 1991 | res187 | Oct II 1992 |
| res104 | Feb 1988 | res146 | Feb I 1991 | res188 | Nov I 1992 |
| res105 | Mar 1988 | res147 | Feb II 1991 | res189 | Nov II 1992 |
| res106 | Apr 1988 | res148 | Mar I 1991 | res190 | Dec I 1992 |
| res107 | May 1988 | res149 | Mar II 1991 | res191 | Dec II 1992 |
| res108 | Jun 1988 | res150 | Apr I 1991 | res192 | Jan I 1993 |
| res109 | Jul 1988 | res151 | Apr II 1991 | res193 | Jan II 1993 |
| res110 | Aug 1988 | res152 | May I 1991 | res194 | Jan III 1993 |
| res111 | Sep 1988 | res153 | May II 1991 | res195 | Feb I 1993 |
| res112 | Oct 1988 | res154 | Jun I 1991 | res196 | Feb II 1993 |
| res113 | Nov 1988 | res155 | Jun II 1991 | res197 | Mar I 1993 |
| res114 | Dec 1988 | res156 | Jul I 1991 | res198 | Mar II 1993 |
| res115 | Jan 1989 | res157 | Jul II 1991 | res199 | Mar III 1993 |
| res116 | Feb 1989 | res158 | Aug I 1991 | res200 | Apr I 1993 |
| res117 | Mar 1989 | res159 | Aug II 1991 | res201 | Apr II 1993 |
| res118 | Apr 1989 | res160 | Sep I 1991 | res202 | Apr III 1993 |
| res119 | May 1989 | res161 | Sep II 1991 | res203 | May I 1993 |
| res120 | Jun 1989 | res162 | Oct I 1991 | res204 | May II 1993 |
| res121 | Jul 1989 | res163 | Oct II 1991 | res205 | May III 1993 |
| res122 | Aug 1989 | res164 | Nov I 1991 | res206 | Jun I 1993 |
| res123 | Sep 1989 | res165 | Nov II 1991 | res207 | Jun II 1993 |
| res124 | Oct 1989 | res166 | Dec I 1991 | res208 | Jun III 1993 |
| res125 | Nov 1989 | res167 | Dec II 1991 | res209 | Jul I 1993 |
| res126 | Dec 1989 | res168 | Jan I 1992 | res210 | Jul II 1993 |
| res127 | Jan 1990 | res169 | Jan II 1992 | res211 | Jul III 1993 |
| res128 | Feb 1990 | res170 | Feb I 1992 | res212 | Aug I 1993 |
| res129 | Mar 1990 | res171 | Feb II 1992 | res213 | Aug II 1993 |
| res130 | Apr 1990 | res172 | Mar I 1992 | res214 | Aug III 1993 |
| res131 | May 1990 | res173 | Mar II 1992 | res215 | Sep I 1993 |
| res132 | Jun 1990 | res174 | Apr I 1992 | res216 | Sep II 1993 |
| res133 | Jul I 1990 | res175 | Apr II 1992 | res217 | Sep III 1993 |
| res134 | Jul II 1990 | res176 | May I 1992 | res218 | Oct I 1993 |
| res135 | Aug I 1990 | res177 | May II 1992 | res219 | Oct II 1993 |
| res136 | Aug II 1990 | res178 | Jun I 1992 | res220 | Oct III 1993 |
| res137 | Sep I 1990 | res179 | Jun II 1992 | res221 | Oct IV 1993 |
| res138 | Sep II 1990 | res180 | Jul I 1992 | res222 | Nov I 1993 |
| res139 | Oct I 1990 | res181 | Jul II 1992 | res223 | Nov II 1993 |
| res140 | Oct II 1990 | res182 | Aug I 1992 | res224 | Nov III 1993 |
| res141 | Nov I 1990 | res183 | Aug II 1992 | res225 | Dec I 1993 |
| res142 | Nov II 1990 | res184 | Sep I 1992 | res226 | Dec II 1993 |
| | | | | res227 | Dec III 1993 |
Starting in 1994, the naming convention was changed to the following simpler form.
| Filename | Dates | Filename | Dates |
| res94001 | | res95001 | |
| res94002 | | res95002 | |
| res94003 | | res95003 | |
| res94004 | | res95004 | |
| ... | | ... | |
| res94046 | | res95106 | |
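The listing above suggests that each new-style name consists of "res", a two-digit year, and a three-digit sequence number within that year. The following Python sketch splits a filename on that assumption. The names parse_res_filename and ResFile are hypothetical helpers introduced here for illustration; they are not part of the BODB distribution, and the interpretation of the digits is inferred from the listing rather than stated by the guide.

```python
import re
from typing import NamedTuple


class ResFile(NamedTuple):
    year: int      # four-digit calendar year
    sequence: int  # position of the file within that year


# Assumed pattern: "res" + two-digit year + three-digit sequence number.
_NEW_STYLE = re.compile(r"^res(\d{2})(\d{3})$")


def parse_res_filename(name: str) -> ResFile:
    """Split a 1994-and-later filename such as "res94001" into its parts."""
    match = _NEW_STYLE.match(name)
    if match is None:
        # Older names such as "res101" carry only three digits and land here.
        raise ValueError(f"not a new-style filename: {name!r}")
    yy, seq = int(match.group(1)), int(match.group(2))
    if yy < 94:
        # Assumption: the new convention starts in 1994 and years are 19YY.
        raise ValueError(f"unexpected year field in {name!r}")
    return ResFile(year=1900 + yy, sequence=seq)
```

Under these assumptions, parse_res_filename("res95106") yields the year 1995 and sequence number 106, while an old-style name such as "res101" is rejected rather than misread.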
A.3 Ticker Symbol Identification
A.3.1 Ticker Symbols for CBOE Index Options
| Ticker | Index | Exercise Style |
| OEX | S&P 100 Index | American |
| OEZ | S&P 100 Index - OEX strike overflow | |
| CPO | S&P 100 Index - CAPS | European |
| SPX | S&P 500 Index | European |
| SPZ | S&P 500 Index - SPX strike overflow | |
| NSX | S&P 500 Index - PM Expiration | European |
| SPL | S&P 500 Index - Long-Dated | European |
| SPQ | S&P 500 Index - End-of-Quarter | European |
| CPS | S&P 500 Index - CAPS | European |
| BIX | S&P Banking Index | European |
| BGX | CBOE BioTech Index | European |
| CEX | S&P Chemical Index | European |
| CWX | CBOE Computer Software Index | European |
| EVX | CBOE Environmental Index | European |
| GAX | CBOE Gaming Index | European |
| GTX | CBOE Global Telecommunications Index | European |
| HCX | S&P Health Care Index | European |
| IUX | S&P Insurance Index | European |
| RIX | CBOE REIT Index | European |
| RLX | S&P Retail Index | European |
| TCX | CBOE U.S. Telecommunications Index | European |
| TRX | S&P Transportation Index | European |
| FSX | FT-SE 100 Index | European |
| ISX | CBOE Israel Index | European |
| MEX | CBOE Mexico Index | European |
| MZX | CBOE Mexico Index (MEX strike overflow) | |
| NIK | Nikkei 300 Index | European |
| NDX | NASDAQ 100 Index | European |
| RUT | Russell 2000 Index | European |
| SGX | S&P/Barra Growth Index | European |
| SVX | S&P/Barra Value Index | European |
A.3.2 Ticker Symbols for Index LEAPS
Ticker Index Expiration
OAX S&P 100 Index 1993
OBX S&P 100 Index 1994
OLX S&P 100 Index 1992, 1995
OCX S&P 100 Index 1996
LSW S&P 500 Index 1993
LSY S&P 500 Index 1994
LSX S&P 500 Index 1992, 1995
LSZ S&P 500 Index 1996
WRU Russell 2000 Index 1994
VRU Russell 2000 Index 1995
LRU Russell 2000 Index 1996
WBG CBOE BioTech Index 1994
VBG CBOE BioTech Index 1995
LBG CBOE BioTech Index 1996
VEX CBOE Mexico Index 1995
VNX Nikkei 300 Index 1995
A.3.3 Ticker Symbols for Interest Rate Options
| Ticker | Underlying |
| IRX | 13-week T-bill |
| VXB | 13-week T-bill (1995 LEAP) |
| LXB | 13-week T-bill (1996 LEAP) |
| FVX | 5-year Note |
| VXV | 5-year Note (1995 LEAP) |
| LXV | 5-year Note (1996 LEAP) |
| TNX | 10-year Note |
| VXN | 10-year Note (1995 LEAP) |
| LXN | 10-year Note (1996 LEAP) |
| TYX | 30-year Bond |
| VYY | 30-year Bond (1995 LEAP) |
| LTY | 30-year Bond (1996 LEAP) |
| LTX | Weighted Average Long-Term Rate (discontinued) |
A.3.4 Ticker Symbols for CBOE Equity Options
(As of July 28, 1993)
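The Listed and Delisted columns of the table below hold six-digit values that appear to be dates in YYMMDD form (for example, 921006 for October 6, 1992). The following Python sketch converts such a value to a calendar date. The function name parse_yymmdd is a hypothetical helper, and both the YYMMDD reading and the 19YY century are assumptions inferred from the table and its July 28, 1993 date, not statements from the guide.

```python
from datetime import date


def parse_yymmdd(code: str) -> date:
    """Convert a six-digit YYMMDD string such as "921006" into a date."""
    if len(code) != 6 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"expected six digits, got {code!r}")
    yy, mm, dd = int(code[:2]), int(code[2:4]), int(code[4:])
    # Assumption: every year is 19YY, since the table is dated July 28, 1993.
    # date() itself rejects impossible months and days with ValueError.
    return date(1900 + yy, mm, dd)


# Example taken from the ALLIED STORES row (ticker ALS).
listed = parse_yymmdd("841226")
delisted = parse_yymmdd("861231")
days_listed = (delisted - listed).days
```

Converting to date objects in this way makes it straightforward to compute how long an option class was listed or to filter tickers by listing date.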
| Ticker | Listed | Delisted | Company Name |
| KKQ | 921006 | | ACCLAIM ENTERTAINMENT, INC. |
| ACT | 881027 | | ACTUA GROUP |
| ADT | 911021 | | ADT LIMITED |
| AVQ | 920508 | | ADVANTA CORPORATION CL. A |
| ABQ | 930313 | | ADVANTA CORP., CLASS B |
| AFP | 881219 | | AFFILIATED PUBLICATIONS INC. |
| AFQ | 920618 | | AFFYMAX N.V. |
| AQG | 930709 | | AGNICO-EAGLE MINES LTD. |
| ABF | 901030 | | AIRBORNE FREIGHT CORPORATION |
| ALC | 930423 | | ALC COMMUNICATIONS CORP. |
| ALA | 921204 | | ALCATEL ALSTHOM ADR |
| AAL | 840221 | | ALEXANDER & ALEXANDER SERVICES |
| AEG | 750922 | | ALLEGIS |
| AYQ | 911121 | | ALLIANCE PHARMACEUTICAL CORP. |
| ATK | 910301 | | ALLIANT TECHSYSTEMS INC |
| ALS | 841226 | 861231 | ALLIED STORES |
| AA | 741217 | | ALUMINUM CO. OF AMERICA |
| AU | 930503 | | AMAX GOLD INC. |
| AMH | 810629 | | AMDAHL CORPORATION |
| AEP | 750523 | | AMERICAN ELECTRIC POWER CO. |
| AXP | 770518 | | AMERICAN EXPRESS CO. |
| AGC | 850730 | | AMERICAN GENERAL |
| AGQ | 850603 | 910722 | AMERICAN GREETINGS |
| AHS | 750623 | 851125 | AMERICAN HOSPITAL SUPPLY |
| AIT | 850813 | | AMERICAN INFO TECHNOLOGY |
| AIG | 841022 | | AMERICAN INT'L GROUP |
| PWQ | 910816 | | AMERICAN POWER CONVERSION CORP |
| ASC | 880810 | | AMERICAN STORES |
| T | 730426 | | AMERICAN TELEPHONE AND TELEGRA |
| AIT | 850813 | | AMERITECH |
| ATQ | 900123 | 920629 | AMER. T.V. & COMMUNICATIONS CL |
| AN | 750624 | | AMOCO |
| AMP | 750926 | | AMP INCORPORATED |
| APC | 870420 | | ANADARKO PETROLEUM |
| AQN | 930503 | | ANDREW CORP. |
| APA | 800725 | 880915 | APACHE CORPORATION |
| APQ | 850603 | 860703 | APOLLO COMPUTER INC. |
| AAQ | 850603 | 910521 | APPLE COMPUTER INC. |
| APM | 881219 | | APPLIED MAGNETICS CORP. |
| ARA | 841022 | 841219 | ARA SERVICES |
| OIQ | 920928 | | ARTISOFT, INC. |
| ARC | 730426 | | ATLANTIC RICHFIELD CORP. |
| AIQ | 920319 | | ATLANTIC SOUTHEAST AIRLINES |
| AQT | 930604 | | ATMEL CORPORATION |
| URQ | 920131 | | AURA SYSTEMS, INC. |
| TQO | 930707 | | AUTOTOTE (CLASS A) |
| AZO | 920127 | | AUTOZONE, INC. |
| AVP | 730801 | | AVON PRODUCTS INC. |
| AQR | 930406 | | AZTAR CORP. |
| JBQ | 920928 | | BAKER (J), INC. |
| BLY | 930709 | | BALLY MANUFACTURING CORPORATIO |
| BLY | 770301 | 910521 | BALLY MANUFACTURING CORP. |
| BDG | 930201 | | BANDAG, INC. |
| BK | 881219 | | BANK OF NEW YORK COMPANY, INC. |
| BKQ | 930218 | | BANK SOUTH CORP. |
| BAC | 760701 | | BANKAMERICA CORPORATION |
| BLH | 930701 | | BANKERS LIFE HOLDING CORP. |
| BNQ | 930210 | | BANYAN SYSTEMS INC. |
| BTI | 921204 | | BAT INDUSTRIES ADR |
| BMG | 870717 | | BATTLE MOUNTAIN GOLD CORP. |
| BAX | 750523 | | BAXTER INTERNATIONAL, INC. |
| BBQ | 900417 | | BAYBANKS, INC. |
| BCE | 901204 | 920118 | BCE INC. |