How to use EDGAR Full Text Search

EDGAR Full Text Search makes use of Autonomy's conceptual search tool. This tool provides a capability similar to natural language processing, and thus avoids many of the limitations of simple keyword searches. The search tool also takes into account the context in which a term appears. This eliminates many potential false hits; at the same time, it catches documents that include the "concept" associated with a term, even though those documents may not include the specific term queried.

This Quick Reference accompanies the EDGAR Full Text Search tool and describes the various ways web users can search for EDGAR filings submitted to the SEC. Not all forms filed with the Commission by public companies will be available for searching. Please see the FAQ section for a list of available form types.

Basic Terminology

    Precedence


    In the context of search term expressions, precedence determines which expressions and/or Boolean operators have priority.

    Search terms


    "Search terms" are the individual words, entire phrases, or even a small paragraph that best describe what you want to look for. They are entered in the search box.

    Stop-words


    Stop-words are words that are so common that they would be useless in a search and are therefore usually ignored. Examples in English include "a", "the", and "and".

    Relevance


    When performing a search, EDGAR Full Text Search will return a result set ranked by relevance. This means that the documents that fit your search terms best, contextually, will be listed first.

    Word Stemming or Stemming


    The EDGAR Full Text Search tool uses stemming to find relevant results. This means that when you enter search terms, the tool stems (i.e. cuts off) at the word level so that it can find matches that are similar or very close to what you're looking for, but not necessarily exact. For example, if we're looking for [walking], it will find results that match [walk, walking, walker, walked, walks] all stemming from walk.
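    The [walking] example can be illustrated with a toy suffix-stripping stemmer. This is only a sketch: the suffix list and the function names are hypothetical, chosen to reproduce the example above, and the stemming rules the tool actually applies are not described here.

```python
# Hypothetical suffix list, chosen only to reproduce the "walk" example.
SUFFIXES = ("ing", "ed", "er", "s")


def naive_stem(word: str) -> str:
    """Strip the first matching suffix; a toy stand-in for a real stemmer."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= 3:
            return lowered[: -len(suffix)]
    return lowered


def same_stem(query: str, candidate: str) -> bool:
    """Two words match when they reduce to the same stem."""
    return naive_stem(query) == naive_stem(candidate)


words = ["walk", "walking", "walker", "walked", "walks", "talk"]
hits = [w for w in words if same_stem("walking", w)]
print(hits)
```

    Every word in the list except "talk" reduces to the stem "walk" and is therefore treated as a match for [walking].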

Conceptual Search

    EDGAR Full Text Search uses Autonomy's advanced pattern-matching technology to conceptually match search terms against the data it holds. You can submit natural language text or search terms or a piece of content (NOT enclosed in quotation marks) and it returns references to conceptually related documents ranked by relevance or contextual distance.

    In other words, EDGAR Full Text Search takes the search terms and identifies the similarities between the pieces of information. In this case, it does not do an exact match, but rather matches any of the terms provided.

    Example 1:

    Conceptual Search Example 1

    In Example 1, if you search for the terms [a company that has any affiliation with], EDGAR Full Text Search will return a result set with filings that contain any of the main concepts or terms (e.g. company, affiliation, etc.).
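    As a rough illustration of "matches any of the terms provided" and "ranked by relevance", the sketch below scores each document by how many of the query's main terms it contains. This is a deliberately simplified assumption, not Autonomy's actual pattern-matching method; the stop-word list, the sample documents, and the function names are all invented for the example.

```python
# Hypothetical stop-word list covering the filler words in the sample query.
STOP_WORDS = {"a", "that", "has", "any", "with", "the", "and"}


def main_terms(query: str) -> list[str]:
    """Keep only the words of the query that are not stop-words."""
    return [w for w in query.lower().split() if w not in STOP_WORDS]


def rank(query: str, documents: dict[str, str]) -> list[str]:
    """Return names of documents matching any term, best match first."""
    terms = main_terms(query)
    scored = []
    for name, text in documents.items():
        words = set(text.lower().split())
        score = sum(1 for term in terms if term in words)
        if score > 0:  # a document needs at least one matching term
            scored.append((score, name))
    # Higher score first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


sample_documents = {
    "A": "the company disclosed its affiliation",
    "B": "the company filed a report",
    "C": "quarterly earnings summary",
}
ranking = rank("a company that has any affiliation with", sample_documents)
print(ranking)
```

    Document A contains both main terms and is listed first, B contains one, and C contains none and is left out.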

Keyword Search

    The EDGAR Full Text Search tool implements an advanced keyword search mechanism that enables it to match any term or phrase that appears in quotation marks in its exact pre-stemmed form. In other words, when doing keyword searches, stemming is turned off.

    Example 2:

    Keyword Search Example 2

    In Example 2, if you search for "Corporation affiliate", EDGAR Full Text Search will find the filings that match the exact phrase or word(s) in quotation marks. If it cannot find an exact match, the tool will work as a conceptual search and find the filings that are most relevant.

    Phrases are stemmed and then matched. Any stop-words that the phrases contain are removed before matching and any punctuation that a phrase contains is ignored.
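    The stop-word and punctuation handling described above can be sketched as follows. This is an illustration only: the stop-word list is a hypothetical stand-in for whatever list the tool uses, the function name is invented for this example, and the stemming step is left out.

```python
import string

# Hypothetical stop-word list; the tool's actual list is not given here.
STOP_WORDS = {"a", "an", "the", "and", "of", "to", "for"}


def normalize_phrase(phrase: str) -> list[str]:
    """Lower-case the phrase, drop punctuation, then drop stop-words."""
    no_punct = phrase.translate(str.maketrans("", "", string.punctuation))
    return [w for w in no_punct.lower().split() if w not in STOP_WORDS]


normalized = normalize_phrase("The affiliate, of the Corporation.")
print(normalized)
```

    Two phrases that normalize to the same word list would be treated as the same phrase under these assumptions.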

Boolean Search

    You can apply Boolean operators (AND, OR, NOT, etc.) to search terms in order to construct more advanced queries that return more specific results.

    NOTE: Boolean operators must be specified using capital letters.

  • AND - All words must appear in the same document.
  • NOT - Words appearing after the NOT must not appear in the document.
  • OR - One or all of the search terms must appear in the document.
  • EOR or XOR - Exactly one of the search terms must appear in the document.
  • Multiple operators can be combined using parentheses.
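    The operators listed above can be modelled as simple set-membership tests. The sketch below is an assumption-laden illustration, not the tool's implementation: documents are reduced to sets of lower-cased words, no stemming is applied, and the function names and sample texts are invented for the example.

```python
def terms_in(text: str) -> set[str]:
    """Reduce a document to a set of lower-cased words."""
    return set(text.lower().split())


def match_and(doc: set[str], a: str, b: str) -> bool:
    return a in doc and b in doc


def match_or(doc: set[str], a: str, b: str) -> bool:
    return a in doc or b in doc


def match_not(doc: set[str], a: str, b: str) -> bool:
    """Models 'a NOT b': a must appear and b must not."""
    return a in doc and b not in doc


def match_xor(doc: set[str], a: str, b: str) -> bool:
    """Exactly one of the two terms appears."""
    return (a in doc) != (b in doc)


both = terms_in("google and yahoo compete")
only_google = terms_in("google reported earnings")
neither = terms_in("quarterly earnings report")

for name, doc in [("both", both), ("only_google", only_google), ("neither", neither)]:
    print(
        name,
        match_and(doc, "google", "yahoo"),
        match_or(doc, "google", "yahoo"),
        match_not(doc, "google", "yahoo"),
        match_xor(doc, "google", "yahoo"),
    )
```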


    Example 3:

    Boolean Search Example 3

    In Example 3, the query returns documents that contain both words "google" and "yahoo".

    NOTE: If you want to use NOT to exclude multiple terms, you need to use brackets. Otherwise, NOT only applies to the term that immediately follows it. If you want to use NOT to exclude a phrase, you need to put the phrase in brackets.

    Example 4:

  • Document 1 contains the phrase: "I went to the city for the New Year"
  • Document 2 contains the phrase: "I went to New York City for the New Year"


    Boolean Search Example 4

    In Example 4, neither Document 1 nor Document 2 is matched, because this query says: find documents that contain the search term "city" but that do not contain the search terms "New" or "York".

    Example 5:

    Boolean Search Example 5

    In Example 5, only Document 1 is matched, not Document 2, because this query says: find documents that contain the search term "city" but that do not contain the search term "New York".
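    The difference between Examples 4 and 5 can be checked with a short sketch that applies the two conditions, as worded above, to Document 1 and Document 2. The function names are invented for this illustration, and the matching is a plain word comparison without stemming; it models the described logic, not the tool's query parser.

```python
doc1 = "I went to the city for the New Year"
doc2 = "I went to New York City for the New Year"


def words(text: str) -> list[str]:
    return text.lower().split()


def has_phrase(text: str, phrase: str) -> bool:
    """True if the words of `phrase` appear consecutively in `text`."""
    w, p = words(text), words(phrase)
    return any(w[i:i + len(p)] == p for i in range(len(w) - len(p) + 1))


def example_4(text: str) -> bool:
    """Contains "city" and contains neither "New" nor "York"."""
    w = set(words(text))
    return "city" in w and not ("new" in w or "york" in w)


def example_5(text: str) -> bool:
    """Contains "city" and does not contain the phrase "New York"."""
    return "city" in set(words(text)) and not has_phrase(text, "New York")


print(example_4(doc1), example_4(doc2))  # both contain "New"
print(example_5(doc1), example_5(doc2))  # only doc2 contains "New York"
```

    Document 1 fails the first condition because "New" appears in "New Year", but passes the second because the phrase "New York" never appears in it.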

Proximity Search

    A proximity search tightens the search constraints by using proximity operators (NEAR, DNEAR, WNEAR, BEFORE, etc.), which allow you to give words that appear close together in the search string a higher weighting.

    NOTE: Proximity operators must be specified using capital letters.

  • NEARn - The NEAR operator only returns documents in which the second term is within n words of the first term. If you do not specify n, NEAR defaults to 6.
  • DNEARn - The Directed NEAR operator only returns documents in which the second term is within n words of the first term, in the specified order. If you do not specify n, DNEAR defaults to 6.
  • WNEARn - The Weighted NEAR operator is a proximity operator that promotes relevance when the spacing is less than the specified n word distance (closer together implies higher relevance). If you do not specify n, WNEAR defaults to 6.
  • BEFORE - BEFORE operator only returns documents in which the first term precedes the second one.
  • AFTER - AFTER operator only returns documents in which the first term appears later than the second one.
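    The NEAR, DNEAR, and BEFORE operators listed above can be sketched using word positions. Several assumptions are made here and should be read as such: "within n words" is taken to mean that the two word positions differ by at most n, text is split on whitespace with no stemming, and the function names are invented. WNEAR is omitted because it affects ranking rather than which documents match.

```python
def positions(words: list[str], term: str) -> list[int]:
    """Indexes at which `term` occurs in the word list."""
    return [i for i, w in enumerate(words) if w == term]


def near(text: str, first: str, second: str, n: int = 6) -> bool:
    """Second term within n words of the first, in either direction."""
    w = text.lower().split()
    return any(abs(j - i) <= n
               for i in positions(w, first) for j in positions(w, second))


def dnear(text: str, first: str, second: str, n: int = 6) -> bool:
    """Second term follows the first by at most n words."""
    w = text.lower().split()
    return any(0 < j - i <= n
               for i in positions(w, first) for j in positions(w, second))


def before(text: str, first: str, second: str) -> bool:
    """Some occurrence of the first term precedes the second term."""
    w = text.lower().split()
    return any(i < j
               for i in positions(w, first) for j in positions(w, second))


sample = "yahoo acquired shares before google responded"
print(near(sample, "google", "yahoo"))       # positions 4 and 0, default n=6
print(near(sample, "google", "yahoo", n=3))  # four words apart, n too small
print(dnear(sample, "google", "yahoo"))      # wrong order for DNEAR
print(dnear(sample, "yahoo", "google"))
print(before(sample, "yahoo", "google"))
```

    AFTER is the mirror image of BEFORE and could be expressed under the same assumptions by swapping the two terms.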


    Example 6:

    Proximity Search Example 6

    In Example 6, the search only returns documents in which the term "yahoo" is no more than 1 word away from the word "google".

    Example 7:

    Proximity Search Example 7

    In Example 7, the search only returns documents in which the term "yahoo" follows the term "google", but is no more than 1 word away from the term "google".

    Example 8:

    Proximity Search Example 8

    In Example 8, extra relevance is given to documents in which "yahoo" and "google" appear within 7 words of each other in a piece of text. This weight increases as the terms get closer to each other.

    Example 9:

    Proximity Search Example 9

    In Example 9, the search only returns documents in which the term "google" appears later than the term "yahoo".

    Example 10:

    Proximity Search Example 10

    In Example 10, the search only returns documents in which the term "yahoo" appears later than the term "google".

Using Wildcards

    You can use the following wildcards in the search terms:

  • ? - to match one character
  • * - to match zero, one or more characters.


    Wildcard matching is done after stemming has taken place. This means that if you search for the term "rollersk*", the system will return documents that contain any terms that have been stemmed to "rollersk", such as "rollerskating", "rollerskater", "rollerskate", "rollerskates", etc.

    Example 11:

    Using ? to match a character:

    Wildcard Example 11

    In Example 11, the search returns documents that contain the term "Mikrotech" or "Microtech".

    Example 12:

    Using * to match zero, one or more characters:

    Wildcard Example 12

    In Example 12, the search returns documents that contain "submit", "submitted", "submission", etc.
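    The behaviour of the two wildcards can be tried out with Python's standard `fnmatch` module, which gives `?` and `*` the same meanings as described in this section. The patterns below are assumptions chosen to be consistent with the results described for Examples 11 and 12, since the actual queries are not reproduced in the text. The sketch ignores stemming, and `fnmatch` additionally treats square brackets as a character set, which this document does not mention.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern: str, term: str) -> bool:
    """Case-insensitive match where ? is one character and * is any run."""
    return fnmatchcase(term.lower(), pattern.lower())


# Assumed pattern for Example 11: one unknown character after "Mi".
single = [t for t in ["Mikrotech", "Microtech", "Mirotech"]
          if wildcard_match("Mi?rotech", t)]
print(single)

# Assumed pattern for Example 12: any ending after "subm".
multi = [t for t in ["submit", "submitted", "submission", "subpoena"]
         if wildcard_match("subm*", t)]
print(multi)
```

    "Mirotech" is rejected because `?` must match exactly one character, while `*` also accepts an empty ending, so "subm*" would match the bare string "subm" as well.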