U.S. Securities & Exchange Commission

EDGAR News:
Restructuring of the Website EDGAR Database:
Information for FTP Users

Because of the recent EDGAR 7.0 release, we will make changes to the SEC Public EDGAR website's file system structure on Wednesday, October 4, 2000.

These changes won't affect most investors and other individuals who use EDGAR to search for and download corporate filings. You can still search for filings the way you always have. And you can still use your old bookmarks. But you might notice that the URL, or on-line address, looks different after October 4th.

These changes will primarily affect our FTP users, who use the File Transfer Protocol to download filings in bulk. What follows are the details of these changes for FTP users. We encourage all FTP users to carefully read this information and hope that it will help ease your transition to the new system.

Before EDGAR release 7.0, public EDGAR filings were available in a directory structure

/edgar/data/<CIK>/<FILINGS>

where <CIK> was a 1-10 digit number that uniquely identified the company or filing entity and <FILINGS> were the physical copies of the filing contents identified by the accession number embedded in the filenames.

The EDGAR 7.0 release allows filers to assign their own filenames for filing sub-documents (which may be PDF, GIF, JPG, HTML, or text format), which can be referenced in the main filing document (or in subsequent filings) via hyperlinks. Because multiple filings were previously stored in the <CIK> directory, it became possible for filers to inadvertently overwrite sub-documents of an earlier filing with the contents of a later one, potentially damaging the integrity of the previous filing.

To prevent this, filings will now be stored in accession number directories beneath the <CIK> directory structure. This accession directory structure will be represented in two ways. Because the accession number is of the format

0123456789-AB-CDEFGH

and we wish to avoid certain limitations and peculiarities of our operating system, a directory structure that conforms to the following has been used to physically store the filings:

/edgar/data/<CIK>/<HGF>/<EDC>/<BA9>/<876>/<543>/<210>/<FILING>

Please note that <HGF>, <EDC>, and so on correspond to the character positions identified in the accession number format shown above. We have reversed the order of the characters in the accession number to create the multi-level directory structure. Realizing that this optimized structure will be very unfriendly to most users, we have also created a symbolic directory reference to this structure, which is the recommended method for accessing filing contents. This symbolic reference follows the pattern

/edgar/data/<CIK>/<0123456789ABCDEFGH>/<FILING>

This format preserves the original order of the accession number with the hyphens removed.
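The two directory forms described above can be derived mechanically from an accession number. The following is a minimal sketch, not an official SEC tool: the function names and the sample CIK "1234" are hypothetical, and the accession number is the placeholder pattern used in this notice.

```python
def compact_accession(accession: str) -> str:
    """Strip the hyphens from an accession number (0123456789-AB-CDEFGH)."""
    compact = accession.replace("-", "")
    if len(compact) != 18:
        raise ValueError("expected 18 characters after removing hyphens")
    return compact


def symbolic_dir(cik: str, accession: str) -> str:
    """Recommended symbolic directory: original character order, no hyphens."""
    return f"/edgar/data/{cik}/{compact_accession(accession)}"


def physical_dir(cik: str, accession: str) -> str:
    """Physical directory: characters reversed, then split into groups of three."""
    reversed_chars = compact_accession(accession)[::-1]
    groups = [reversed_chars[i:i + 3] for i in range(0, len(reversed_chars), 3)]
    return f"/edgar/data/{cik}/" + "/".join(groups)


if __name__ == "__main__":
    # "1234" is a made-up CIK used only for illustration.
    print(symbolic_dir("1234", "0123456789-AB-CDEFGH"))
    # -> /edgar/data/1234/0123456789ABCDEFGH
    print(physical_dir("1234", "0123456789-AB-CDEFGH"))
    # -> /edgar/data/1234/HGF/EDC/BA9/876/543/210
```

Only the symbolic form is recommended for access; the physical form is shown to make the character reversal concrete.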

Pre-EDGAR 7.0 filings were composed of the following documents:

/edgar/data/CIK/0123456789-AB-CDEFGH.txt

(The raw ASCII text version of the entire filing with embedded subdocuments in uuencoded format)

/edgar/data/CIK/0123456789-AB-CDEFGH-index.html

(An HTML version of the filing contents with individual subdocuments hyperlink referenced)

/edgar/data/CIK/0123456789-AB-CDEFGH.hdr.sgml

(The header information in EDGAR SGML format)

/edgar/data/CIK/0123456789-AB-CDEFGH-d#.<ext>

(The individual subdocuments extracted from the raw filing. The # sign is a sequence number and the <ext> is one of the filename extensions txt, pdf, htm, or html)

All filings before May 26, 2000 remain in the original format and directory structure.

Post-EDGAR 7.0 filings, from May 26, 2000, to the present, are composed of the following documents:

/edgar/data/CIK/0123456789ABCDEFGH/0123456789-AB-CDEFGH.txt

(The raw ASCII text version)

/edgar/data/CIK/0123456789ABCDEFGH/0123456789-AB-CDEFGH-index.htm

(The HTML version with hyperlink references. PLEASE NOTE: the extension .htm is not a misprint and is significant. It identifies post-EDGAR 7.0 filings.)

/edgar/data/CIK/0123456789ABCDEFGH/0123456789-AB-CDEFGH.hdr.sgml

(The SGML header contents)

/edgar/data/CIK/0123456789ABCDEFGH/0123456789-AB-CDEFGH-####.<ext>

(Extracted subdocuments that were not explicitly named by the filer. The #### is a 4 digit sequence number.)

/edgar/data/CIK/0123456789ABCDEFGH/.<ext>

(Extracted subdocuments that were explicitly named by the filer.)
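For scripts that need to locate the post-EDGAR 7.0 documents listed above, the paths can be assembled from the CIK and accession number. This is a rough sketch under the naming patterns given in this notice; the function names, dictionary keys, and the sample CIK "1234" are assumptions introduced for illustration.

```python
def post_7_paths(cik: str, accession: str) -> dict:
    """Build the fixed-name document paths for a post-EDGAR 7.0 filing."""
    compact = accession.replace("-", "")
    base = f"/edgar/data/{cik}/{compact}/{accession}"
    return {
        "raw_text": base + ".txt",       # raw ASCII text version
        "index": base + "-index.htm",    # HTML version (.htm, not .html)
        "header": base + ".hdr.sgml",    # SGML header contents
    }


def unnamed_subdocument(cik: str, accession: str, sequence: int, ext: str) -> str:
    """Path of an extracted subdocument the filer did not name explicitly."""
    compact = accession.replace("-", "")
    # The sequence number is zero-padded to 4 digits, per the #### pattern.
    return f"/edgar/data/{cik}/{compact}/{accession}-{sequence:04d}.{ext}"


def is_post_7_index(path: str) -> bool:
    """Post-7.0 index files end in -index.htm; pre-7.0 ones end in -index.html."""
    return path.endswith("-index.htm")


if __name__ == "__main__":
    paths = post_7_paths("1234", "0123456789-AB-CDEFGH")  # hypothetical CIK
    print(paths["index"])
    print(unnamed_subdocument("1234", "0123456789-AB-CDEFGH", 1, "txt"))
```

The extension check relies on the .htm versus .html distinction noted above, so it should be applied to index filenames only.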

To lessen the impact of the changes for website (http) users, the following symbolic files have also been created for post-EDGAR 7.0 filings:

/edgar/data/CIK/0123456789-AB-CDEFGH.txt
(represents the raw text file physically stored in the accession directory structure.)
/edgar/data/CIK/0123456789-AB-CDEFGH-index.htm
(represents the HTML version of the filing physically stored in the accession directory structure.)

Note

Because we needed to use symbolic directory references, FTP users may need to issue change directory commands for each directory in turn, instead of changing directory using the full path in one command. For example, the following syntax may not work correctly:

cd /edgar/data/<CIK>/<ACCNO-NO-DASHES>
get /edgar/data/<CIK>/<ACCNO-NO-DASHES>/<FILENAME>

Instead, issue the following command sequence:

cd edgar
cd data
cd <CIK>
cd <ACCNO-NO-DASHES>
get <FILENAME>
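Users who script their downloads can apply the same one-directory-at-a-time sequence programmatically. The sketch below uses Python's standard ftplib module; the function name is hypothetical, and it assumes an already connected and logged-in session whose current directory is the server root.

```python
from ftplib import FTP
from typing import Callable


def fetch_filing_file(ftp: FTP, cik: str, accession: str,
                      filename: str, sink: Callable[[bytes], object]) -> None:
    """Download one file by changing directory one level at a time."""
    compact = accession.replace("-", "")  # accession number without dashes
    # Issue a separate change-directory command per level instead of one
    # full-path cd, which may not resolve the symbolic directory references.
    for part in ("edgar", "data", cik, compact):
        ftp.cwd(part)
    # Retrieve by bare filename from the current directory.
    ftp.retrbinary(f"RETR {filename}", sink)


# Example use (host and CIK are placeholders, not real values):
#
#   ftp = FTP("ftp.example.org")
#   ftp.login()  # anonymous login
#   with open("filing.txt", "wb") as out:
#       fetch_filing_file(ftp, "1234", "0123456789-AB-CDEFGH",
#                         "0123456789-AB-CDEFGH.txt", out.write)
#   ftp.quit()
```

Passing the session in as a parameter keeps the function easy to exercise against a stand-in object before pointing it at a live server.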

We apologize for any hardship or confusion this may cause for our bulk FTP users. Questions may be directed to

webmaster@sec.gov

Webmaster will forward the questions to the appropriate technical staff. We recommend that you include phone contact information as well as your e-mail address in the body of the message.

http://www.sec.gov/info/edgar/ednews/restructure.htm


Modified: 09/28/2000