Unstructured data

From Wikipedia, the free encyclopedia


Unstructured data (or unstructured information) refers to masses of (usually) computerized information that either have no data structure or have one that is not easily readable by a machine. The term is imprecise: software that creates machine-processable structure exploits word morphology, sentence syntax, and other small- and large-scale patterns found in source materials to discern the linguistic, auditory, and visual structure that is inherent in all forms of human communication.[1] Examples of "unstructured data" include audio, video, and unstructured text such as the body of an email or word processor document.

Merrill Lynch estimates that more than 85% of all potentially usable business information originates in unstructured form.[2]

Data with some form of structure may also be referred to as unstructured data if the structure is not helpful for the desired processing task. For example, an HTML Web page is tagged, but this form of structure is typically oriented towards formatting rather than capturing the meaning or function of the tagged elements in ways that support automated processing of the information content of the page.
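The point about HTML can be illustrated with a minimal sketch that uses Python's standard `html.parser` module. The sample page, the class name `TagAndTextCollector`, and the helper `summarize_html` are hypothetical and chosen only for illustration. The sketch collects the tag names and the visible text separately, which shows that the tags describe presentation (paragraph, bold, line break, italic) and say nothing about which piece of text is an invoice number or an amount.

```python
from html.parser import HTMLParser


class TagAndTextCollector(HTMLParser):
    """Collects start-tag names and visible text from an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.tags = []        # tag names, in document order
        self.text_parts = []  # non-empty text chunks, in document order

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def summarize_html(page):
    """Return (tag names, visible text) for a hypothetical HTML fragment."""
    collector = TagAndTextCollector()
    collector.feed(page)
    collector.close()
    return collector.tags, " ".join(collector.text_parts)


# Hypothetical page: the markup is purely presentational.
SAMPLE_PAGE = "<p><b>Invoice 1047</b><br>Total due: <i>250.00 USD</i></p>"

tags, text = summarize_html(SAMPLE_PAGE)
print(tags)  # formatting-oriented tag names only
print(text)  # the content, still without any machine-readable meaning
```

Nothing in the collected tag names identifies "1047" as an invoice number or "250.00 USD" as an amount; recovering those roles would require further processing of the text itself.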

Data mining and text analytics are different approaches used to find patterns in, or otherwise interpret, unstructured information. Common techniques for structuring text usually involve manual tagging with metadata or part-of-speech tagging as a basis for further text mining-based structuring. UIMA (Unstructured Information Management Architecture) provides a common framework for processing this information to extract meaning and create structured data about the information.
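As a rough illustration of what "creating structured data about the information" can mean, the sketch below attaches a small metadata record to a free-text email body. It is a toy stand-in based on regular expressions, not part-of-speech tagging and not the UIMA API. The patterns, the function name `extract_metadata`, and the sample text are all assumptions made for this example.

```python
import re

# Assumed patterns for this sketch: ISO-style dates and simple email addresses.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
EMAIL_PATTERN = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")


def extract_metadata(body):
    """Build a structured record describing an unstructured text body."""
    return {
        "dates": DATE_PATTERN.findall(body),
        "emails": EMAIL_PATTERN.findall(body),
        "word_count": len(body.split()),
    }


# Hypothetical email body used only for illustration.
SAMPLE_BODY = (
    "Hi Ana, the review moved to 2005-03-01. "
    "Please confirm with sam@example.com."
)

record = extract_metadata(SAMPLE_BODY)
print(record)
```

The resulting dictionary can be stored or queried like any other structured record, while the original text stays unchanged. Real text analytics pipelines generally rely on linguistic analysis rather than fixed patterns like these.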

  1. Structure, Models and Meaning: Is "unstructured" data merely unmodeled?, Intelligent Enterprise, March 1, 2005.
  2. The problem with unstructured data, DMReview, February 2003.
