Decision tree

From Wikipedia, the free encyclopedia

In operations research, specifically in decision analysis, a decision tree (or tree diagram) is a decision support tool that uses a graph or model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. A decision tree is used to identify the strategy most likely to reach a goal. Another use of trees is as a descriptive means for calculating conditional probabilities.

In data mining and machine learning, a decision tree is a predictive model; that is, a mapping from observations about an item to conclusions about its target value. More descriptive names for such tree models are classification tree (discrete outcome) or regression tree (continuous outcome). In these tree structures, leaves represent classifications and branches represent conjunctions of features that lead to those classifications. The machine learning technique for inducing a decision tree from data is called decision tree learning, or (colloquially) decision trees.

In decision analysis, a decision tree (and a closely related model form, the influence diagram) is used as a visual and analytical decision support tool, in which the expected values (or expected utilities) of competing alternatives are calculated.

For example, a factory makes product B. The manager has to decide whether to invest in developing a new product, either product A or product C. (She cannot do both due to budget constraints.) Product A is estimated to require $2M of R&D investment, but has only a 50% chance of the research succeeding and yielding a product. If it succeeds, the product has a 30% chance of earning $5M in profit, a 40% chance of earning $10M, and a 30% chance of no sales. Product C, on the other hand, will also cost $2M in R&D but has an 80% chance of earning $5M in profit and a 20% chance of no sales. The manufacturing cost for either product is $1M.

If the company has a policy of maximising expected values, which is the preferred strategy? The alternatives, probabilities, payoffs, and resulting expected value calculations are shown in the example tree below. In this case both Product A and Product C are expected to turn a profit, but Product C has the higher expected value, at $1 million:

Image:dtree2.png
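
The expected value calculation in the tree can be sketched in a few lines of Python. This is an illustrative rollback under stated assumptions, not a reproduction of the figure: the class and function names are hypothetical, amounts are in millions of dollars, and the $1M manufacturing cost is assumed to be paid whenever a product exists (after successful research for Product A, always for Product C) regardless of later sales. Under those assumptions Product C comes out at the $1 million stated above and Product A at a smaller positive value.

```python
from dataclasses import dataclass


@dataclass
class Payoff:
    value: float  # terminal payoff, in $M


@dataclass
class Chance:
    branches: list      # list of (probability, child node) pairs
    cost: float = 0.0   # cost paid on entering this node, in $M


@dataclass
class Decision:
    options: dict       # option name -> child node


def expected_value(node):
    """Roll the tree back from the leaves to this node."""
    if isinstance(node, Payoff):
        return node.value
    if isinstance(node, Chance):
        # Probability-weighted average of the children, minus the entry cost.
        weighted = sum(p * expected_value(child) for p, child in node.branches)
        return weighted - node.cost
    # Decision node: an expected-value maximiser takes the best option.
    return max(expected_value(child) for child in node.options.values())


def best_option(decision):
    """Name of the option with the highest expected value."""
    return max(decision.options,
               key=lambda name: expected_value(decision.options[name]))


# Product A: $2M R&D, 50% chance of success; manufacturing ($1M, assumed
# to be paid only if research succeeds) precedes the sales outcome.
sales_a = Chance(cost=1.0, branches=[(0.3, Payoff(5.0)),
                                     (0.4, Payoff(10.0)),
                                     (0.3, Payoff(0.0))])
product_a = Chance(cost=2.0, branches=[(0.5, sales_a), (0.5, Payoff(0.0))])

# Product C: $2M R&D plus $1M manufacturing, then the sales outcome.
product_c = Chance(cost=3.0, branches=[(0.8, Payoff(5.0)), (0.2, Payoff(0.0))])

root = Decision(options={"Product A": product_a, "Product C": product_c})

for name, node in root.options.items():
    print(name, round(expected_value(node), 2))
print("Preferred:", best_option(root))
```

A different assumption about when the manufacturing cost is incurred changes the figure for Product A, so the structure of the tree matters as much as the probabilities.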

The same example can be repeated taking account of the time value of money by discounting to net present values. In this scenario Product C is clearly the winning choice, with a payout of $0.36 million, while Product A is not expected to turn a profit.

Image:dtreenpva2.png
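
Discounting itself is simple arithmetic. The sketch below shows the general net present value formula only; the discount rate and the timing of the cash flows are hypothetical, since the text does not state the ones used in the figure, so the result is not expected to match the $0.36 million above.

```python
def npv(cash_flows, rate):
    """Net present value of cash flows received at the end of years 0, 1, 2, ...

    cash_flows[t] is the amount in year t; rate is the annual discount rate.
    """
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


# Hypothetical timing for illustration: R&D in year 0, manufacturing in
# year 1, expected sales in year 2 (all in $M), discounted at an assumed 10%.
example_flows = [-2.0, -1.0, 4.0]
print(round(npv(example_flows, 0.10), 3))
```

In a discounted tree, each payoff and cost would be replaced by its present value before the expected values are rolled back.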

Analysis can take into account the decision maker's (e.g., the company's) preference or utility function, for example:

Image:RiskPrefSensitivity2Threshold.png

The basic interpretation in this situation is that the company prefers B's risk and payoffs under realistic risk preference coefficients (greater than $400K; in that range of risk aversion, the company would need to model a third strategy, "Neither A nor B").
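
One common way to express such a risk preference is an exponential utility function with a single risk tolerance parameter. The sketch below assumes that form, which the text does not confirm was used for the figure; the function name and the example lottery are hypothetical. It computes the certainty equivalent, the guaranteed amount the decision maker would value equally to the gamble.

```python
import math


def certainty_equivalent(lottery, risk_tolerance):
    """Certainty equivalent under exponential utility u(x) = 1 - exp(-x / R).

    lottery is a list of (probability, net payoff) pairs; risk_tolerance (R)
    is in the same units as the payoffs. Smaller R means more risk aversion.
    """
    expected_exp = sum(p * math.exp(-x / risk_tolerance) for p, x in lottery)
    return -risk_tolerance * math.log(expected_exp)


# Hypothetical lottery in $M: 80% chance of netting 2, 20% chance of losing 3.
# Its plain expected value is 0.8 * 2 + 0.2 * (-3) = 1.0.
lottery = [(0.8, 2.0), (0.2, -3.0)]

for r in (1.0, 5.0, 1_000_000.0):
    print(r, round(certainty_equivalent(lottery, r), 3))
```

As the risk tolerance grows, the certainty equivalent approaches the expected value; as it shrinks, the gamble becomes less attractive and can drop below zero, which is the kind of threshold a sensitivity chart makes visible.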

A decision tree can be represented more compactly as an influence diagram, focusing attention on the issues and relationships between events.

Image:Factory2_InfluenceDiagram.png

Decision trees, influence diagrams, utility functions, and other decision analysis tools and methods are taught to undergraduate students in schools of business, health economics, and public health, and are examples of operations research or management science methods.

Three popular rules are applied in the automatic creation of classification trees: Gini, entropy, and twoing. The Gini rule splits off a single group of as large a size as possible, whereas the entropy and twoing rules find multiple groups comprising as close to half the samples as possible. Each rule is applied recursively down the tree until stopping criteria are met.
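
To make the splitting rules concrete, the following sketch computes the Gini and entropy impurity measures on which the first two rules are based, and scores a candidate split by the size-weighted impurity of its two groups. The function names are illustrative, and twoing is omitted for brevity.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class proportions, in bits."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())


def split_impurity(left, right, measure):
    """Size-weighted impurity of the two groups produced by a split."""
    n = len(left) + len(right)
    return (len(left) / n) * measure(left) + (len(right) / n) * measure(right)


labels = ["yes", "yes", "no", "no"]
print(gini(labels), entropy(labels))

# A split that separates the classes perfectly has zero weighted impurity.
print(split_impurity(["yes", "yes"], ["no", "no"], gini))
```

A tree-growing procedure would evaluate many candidate splits at each node, keep the one with the lowest weighted impurity, and then repeat on each resulting group.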

Amongst decision support tools, decision trees (and influence diagrams) have several advantages:

Decision trees:

  • are simple to understand and interpret. People are able to understand decision tree models after a brief explanation.
  • have value even with little hard data. Important insights can be generated based on experts describing a situation (its alternatives, probabilities, and costs) and their preferences for outcomes.
  • use a white box model. If a given result is provided by a model, the explanation for the result is easily replicated by simple math.
  • can be combined with other decision techniques. The following example uses Net Present Value calculations, PERT 3-point estimations (decision #1) and a linear distribution of expected outcomes (decision #2):

Image:Investment Decision Occam s Tree.gif

  • can be used to optimize an investment portfolio. The following example shows a portfolio of 7 investment options (projects). The organization has $10,000,000 available for the total investment. Bold lines mark the best selection (projects 1, 3, 6, and 7), which will cost $7,740,000 and create a payoff of $2,710,000. All other combinations would either exceed the budget or yield a lower payoff:

Image:Project Investment Portfolio Occam s Tree.gif
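
The portfolio selection described in the last item amounts to checking every combination of projects against the budget. A minimal brute-force sketch follows; the per-project costs and payoffs are made up for illustration, because the text gives only the totals for the figure's seven projects, and the function name is hypothetical.

```python
from itertools import combinations


def best_portfolio(projects, budget):
    """Return (selection, total cost, total payoff) of the best affordable set.

    projects maps a project id to a (cost, payoff) pair. Every non-empty
    combination is checked, which is practical for a handful of projects.
    """
    best = ((), 0, 0)
    ids = sorted(projects)
    for size in range(1, len(ids) + 1):
        for selection in combinations(ids, size):
            cost = sum(projects[i][0] for i in selection)
            payoff = sum(projects[i][1] for i in selection)
            if cost <= budget and payoff > best[2]:
                best = (selection, cost, payoff)
    return best


# Hypothetical data, NOT the figure's: id -> (cost, payoff) in $M.
projects = {1: (3, 2), 2: (4, 3), 3: (5, 4)}
print(best_portfolio(projects, budget=8))
```

With seven projects there are only 127 non-empty combinations, so exhaustive search is adequate; larger portfolios would usually call for a knapsack-style algorithm instead.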