How to use EDGAR Full Text Search
EDGAR Full Text Search makes use of Autonomy's conceptual search tool. This tool provides a capability similar to natural
language processing, and thus avoids many of the limitations found when using simple keyword searches. The search tool also takes
into account the context in which a term appears. This eliminates many potential false hits; at the same time, it catches documents
that may include the "concept" associated with a term, even though those documents may not include the specifically queried term.
In the context of search term expressions, precedence means which expressions and/or Boolean operators have priority.
The "search terms" refers to individual words, entire phrases, or even a small paragraph that best describes what you want to look for, to be entered in the search box.
Stop words are words so common that they would be useless in a search and are therefore usually ignored. English examples include "a", "the", and "and".
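As a rough illustration of how stop words drop out of a query, the sketch below filters them from a search string. The stop-word list and the function name remove_stop_words are hypothetical; the actual list used by EDGAR Full Text Search is not given here.

```python
from __future__ import annotations

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "the", "and", "of", "in"}


def remove_stop_words(query: str) -> list[str]:
    """Lowercase the query, split it on whitespace, and drop stop words."""
    return [word for word in query.lower().split() if word not in STOP_WORDS]


print(remove_stop_words("the acquisition of a subsidiary"))
```

Only the words that carry meaning ("acquisition", "subsidiary") remain to be matched.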
When performing a search, EDGAR Full Text Search will return a result set ranked by relevance. This means that the documents that fit your search terms better, contextually, will appear first in the list.
Word Stemming or Stemming
The EDGAR Full Text Search tool uses stemming to find relevant results. This means that when you enter search terms, the tool stems (i.e. cuts off) at the word level so that it can find matches that are similar or very close to what you're looking for, but not necessarily exact. For example, if we're looking for [walking], it will find results that match [walk, walking, walker, walked, walks] all stemming from walk.
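The idea of stemming can be sketched with a deliberately naive suffix stripper. This is not the algorithm the search tool uses (that is not described here); the suffix list and the function name naive_stem are assumptions chosen only to reproduce the "walk" example.

```python
# Hypothetical suffix list; real stemmers use far more elaborate rules.
SUFFIXES = ("ing", "ed", "er", "s")


def naive_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for term in ["walk", "walking", "walker", "walked", "walks"]:
    print(term, "->", naive_stem(term))
```

Because all five terms reduce to the same stem, a search for any one of them can match documents containing the others.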
EDGAR Full Text Search uses Autonomy's advanced pattern-matching technology to conceptually match search terms against the data it holds. You can submit natural language text or search terms or a piece of content (NOT enclosed in quotation marks) and it returns references to conceptually related documents ranked by relevance or contextual distance.
The EDGAR Full Text Search tool implements an advanced keyword search mechanism that enables it to match any term or phrase that appears in quotation marks in its exact pre-stemmed form. In other words, when doing keyword searches, stemming is turned off.
You can apply Boolean operators (AND, OR, NOT, etc.) to search terms in order to construct more advanced queries that return more specific results.
In Example 3, the query returns documents that contain both words "google" and "yahoo".
NOTE: If you want to use NOT to exclude multiple terms, you need to use brackets. Otherwise, NOT only applies to the term that immediately follows it. If you want to use NOT to exclude a phrase, you need to put the phrase in brackets.
In Example 4, neither Document 1 nor Document 2 is matched, because this query says: find documents that contain the search term "city" but that do not contain the search terms "New" or "York".
In Example 5, only Document 1 is matched, not Document 2, because this query says: find documents that contain the search term "city" but that do not contain the search term "New York".
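The difference between excluding two separate terms and excluding a phrase can be modeled with plain Python predicates. The two documents below are hypothetical stand-ins chosen to behave as described, and the helper names are assumptions. The sketch ignores stemming and does not reproduce the search tool's query syntax.

```python
import re


def tokens(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def contains_term(text, term):
    return term.lower() in tokens(text)


def contains_phrase(text, phrase):
    """True if the words of the phrase appear consecutively in the text."""
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


# Hypothetical documents, for illustration only.
doc1 = "The new city council met in York."
doc2 = "New York City filed its annual report."


def city_not_new_or_york(text):
    # "city", excluding documents with the term "New" or the term "York".
    return contains_term(text, "city") and not (
        contains_term(text, "new") or contains_term(text, "york")
    )


def city_not_phrase_new_york(text):
    # "city", excluding documents with the phrase "New York".
    return contains_term(text, "city") and not contains_phrase(text, "New York")


print(city_not_new_or_york(doc1), city_not_new_or_york(doc2))
print(city_not_phrase_new_york(doc1), city_not_phrase_new_york(doc2))
```

The first predicate rejects both documents because each contains "New" or "York" somewhere. The second rejects only the document in which the two words appear together as a phrase.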
A proximity search tightens the search constraints by using proximity operators (NEAR, DNEAR, WNEAR, BEFORE, etc.), which allow you to give a higher weighting to words that appear close together.
In Example 6, the search only returns documents in which the term "yahoo" is no more than 1 word away from the word "google".
In Example 7, the search only returns documents in which the term "yahoo" follows the term "google", but is no more than 1 word away from the term "google".
In Example 8, extra relevance is given to documents in which "yahoo" and "google" appear within 7 words of each other in a piece of text. This weight increases as the terms get closer to each other.
In Example 9, the search only returns documents in which the term "google" appears later than the term "yahoo".
In Example 10, the search only returns documents in which the term "yahoo" appears later than the term "google".
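The proximity behaviors described in these examples can be approximated on a tokenized document. The sketch below is an assumption-laden model, not the search tool's implementation: the function names are hypothetical, "N words away" is taken to mean a difference of at most N in word position, and the scoring formula in proximity_boost is invented purely to show a weight that grows as terms get closer.

```python
import re


def positions(text, term):
    """Return the word positions at which term occurs (case-insensitive)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [i for i, w in enumerate(words) if w == term.lower()]


def within_n_words(text, a, b, n):
    """True if a and b occur at most n word positions apart, in either order."""
    pa, pb = positions(text, a), positions(text, b)
    return any(abs(i - j) <= n for i in pa for j in pb)


def follows_within_n(text, first, second, n):
    """True if second occurs after first, at most n word positions later."""
    p1, p2 = positions(text, first), positions(text, second)
    return any(0 < j - i <= n for i in p1 for j in p2)


def appears_later(text, first, second):
    """True if some occurrence of second comes after some occurrence of first."""
    p1, p2 = positions(text, first), positions(text, second)
    return bool(p1) and bool(p2) and max(p2) > min(p1)


def proximity_boost(text, a, b, window=7):
    """Invented score: 1.0 for adjacent terms, falling to 0.0 beyond the window."""
    pa, pb = positions(text, a), positions(text, b)
    if not pa or not pb:
        return 0.0
    distance = min(abs(i - j) for i in pa for j in pb)
    return (window - distance + 1) / window if distance <= window else 0.0


doc = "Analysts compared google yahoo results this quarter."
print(within_n_words(doc, "yahoo", "google", 1))
print(follows_within_n(doc, "google", "yahoo", 1))
print(appears_later(doc, "yahoo", "google"))
print(proximity_boost(doc, "yahoo", "google"))
```

Separating the order-insensitive check, the ordered check, and the weighting into distinct functions mirrors the distinction the examples draw between filtering documents and boosting their relevance.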
You can use the following wildcards in the search terms:
Wildcard matching is done after stemming has taken place. This means that if you search for the term "rollersk*", the system will return documents that contain any terms that have been stemmed to "rollersk", such as "rollerskating", "rollerskater", "rollerskate", and "rollerskates".
Using ? to match a character:
In Example 11, the search returns documents that contain the term "Mikrotech" or "Microtech".
Using * to match zero, one or more characters:
In Example 12, the search returns documents that contain "submit", "submitted", "submission", etc.
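The two wildcards behave much like shell-style patterns, so Python's standard fnmatch module gives a convenient way to experiment with them. This is only an analogy: the pattern "Mi?rotech" is a hypothetical guess at a query that would match both spellings, the helper name wildcard_match is an assumption, and the sketch matches raw terms rather than stemmed ones. Note also that fnmatch treats square brackets as character classes, which the search tool may not.

```python
from fnmatch import fnmatchcase


def wildcard_match(term, pattern):
    """Case-insensitive match where ? is one character and * is zero or more."""
    return fnmatchcase(term.lower(), pattern.lower())


# ? stands for exactly one character.
for term in ["Mikrotech", "Microtech", "Mirotech"]:
    print(term, wildcard_match(term, "Mi?rotech"))

# * stands for zero, one, or more characters.
for term in ["rollersk", "rollerskate", "rollerskating"]:
    print(term, wildcard_match(term, "rollersk*"))
```

"Mirotech" does not match because ? requires a character in that position, whereas * also accepts an empty remainder.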