What I learned today: Data mining

Data mining is the extraction of hidden predictive information from large databases.
More generally, data mining (sometimes called data discovery or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both.
Data mining software analyzes relationships and patterns in stored transaction data based on open-ended user queries.

Four types of relationships in data mining:

Classes: Stored data is used to locate data in predetermined groups.

Clusters: Data items are grouped according to logical relationships or consumer preferences.

Associations: Data can be mined to identify associations between items, such as products that are frequently bought together.

Sequential patterns: Data is mined to anticipate behavior patterns and trends.
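
To make the idea of associations more concrete, here is a minimal sketch of the two measures usually used to judge a rule like "customers who buy X also buy Y": support and confidence. The transactions and item names are made up for illustration, and the helper names (count, support, confidence) are my own, not from any library.

```python
# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def support(itemset, transactions):
    """Fraction of all transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also hold the consequent."""
    base = count(antecedent, transactions)
    if base == 0:
        raise ValueError("antecedent never occurs in the transactions")
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / base

print(support({"diapers", "beer"}, transactions))       # share of baskets with both items
print(confidence({"diapers"}, {"beer"}, transactions))  # how often diapers buyers also buy beer
```

A rule is normally considered interesting only when both numbers are high enough; the cut-off values are a choice of the analyst.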

Techniques

Artificial neural networks: Non-linear predictive models that learn through training and resemble biological neural networks in structure.

Decision trees: Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset. Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID).
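
As a rough sketch of how a single decision in such a tree can be chosen, the code below searches for the threshold on one numeric attribute that gives the lowest weighted Gini impurity, a split criterion commonly associated with CART for classification. This is only one node of a tree, not a full CART or CHAID implementation, and the data and function names are assumptions for illustration.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best 'value <= threshold' split."""
    pairs = list(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split at all is the baseline
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between neighboring distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best

# Hypothetical data: customer age and whether the customer responded to an offer.
ages = [22, 25, 30, 35, 40, 52, 60]
responded = ["no", "no", "no", "yes", "yes", "yes", "yes"]

threshold, impurity = best_split(ages, responded)
print(f"if age <= {threshold} then 'no' else 'yes' (impurity {impurity})")
```

The chosen split reads directly as an if-then rule, which is how a tree ends up generating classification rules.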

Genetic algorithms: Optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of evolution.
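
A toy sketch of those three ingredients (selection, combination, mutation) might look like the following. The objective here is the classic "OneMax" exercise, where fitness is simply the number of 1 bits; it is chosen only because it is easy to follow. All parameter values and names are assumptions for illustration.

```python
import random

def fitness(bits):
    """OneMax toy objective: the number of 1 bits."""
    return sum(bits)

def evolve(pop_size=20, length=16, generations=40, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        best_per_generation.append(fitness(population[0]))
        # Selection: the fitter half survives unchanged.
        survivors = population[: pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mom, dad = rng.sample(survivors, 2)
            # Combination: single-point crossover of two parents.
            cut = rng.randint(1, length - 1)
            child = mom[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            children.append(child)
        population = survivors + children
    return max(population, key=fitness), best_per_generation

best, progress = evolve()
print(best, fitness(best))
```

Because the survivors are carried over unchanged, the best fitness can never get worse from one generation to the next; whether it reaches the optimum depends on the parameters.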

Nearest neighbor method: A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k ≥ 1). Sometimes called the k-nearest neighbor technique.
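
A minimal sketch of the k-nearest neighbor idea, assuming Euclidean distance as the similarity measure and a simple majority vote as the "combination" of classes (both are my assumptions; other distances and weighted votes are possible). The records are invented for illustration, and math.dist needs Python 3.8 or newer.

```python
from collections import Counter
from math import dist

def knn_classify(query, records, k=3):
    """Predict the majority class among the k records closest to query."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Sort the historical records by Euclidean distance to the query point.
    neighbors = sorted(records, key=lambda rec: dist(rec[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical historical records: ((income in $1000s, age), class label).
records = [
    ((30, 25), "low"),
    ((35, 30), "low"),
    ((40, 28), "low"),
    ((80, 45), "high"),
    ((85, 50), "high"),
    ((90, 48), "high"),
]

print(knn_classify((78, 44), records, k=3))
```

In practice the attributes are usually rescaled first, otherwise the one with the largest numeric range dominates the distance.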

Rule induction: The extraction of useful if-then rules from data based on statistical significance.

http://www.thearling.com/index.htm#wps - Information about data mining and analytic technologies
http://www.statsoft.com/textbook/stdatmin.html - White papers about data mining and related topics
http://www.thearling.com/dmintro/dmintro_frame.htm - Data mining introduction presentation

Wednesday, 25 November 2009
