Data mining is the process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. In other words, data mining is mining knowledge from data. It uses sophisticated mathematical algorithms to segment the data and evaluate the probability of future events.
The term "data mining" was used critically by economist Michael Lovell in an article published in the Review of Economics and Statistics in 1983. The term appeared in the database community around 1990, generally with positive connotations. For a short time in the 1980s the phrase "database mining" was used, but HNC, a San Diego-based company, trademarked it to pitch its Database Mining Workstation, so researchers turned to "data mining" instead.
While large-scale information technology has been evolving separate transaction and analytical systems, data mining provides the link between the two. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. It can also answer questions that cannot be addressed through simple query and reporting techniques. Generally, any of four types of relationships is sought:
- Classes - Stored data is used to locate data in predetermined groups.
- Clusters - Data items are grouped according to logical relationships or consumer preferences.
- Sequential patterns - Data is mined to anticipate behavior patterns and trends.
- Associations - Data can be mined to identify associations. The beer-diaper example, in which the two products turn out to be bought together, is a well-known case of associative mining.
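To make the association case concrete, the following minimal Python sketch computes the three measures commonly used to judge a rule such as "diapers -> beer": support, confidence, and lift. The transactions are invented for illustration, and the function names (support, confidence, lift) are choices made for this sketch rather than part of any particular data mining tool.

```python
# Hypothetical market-basket transactions (illustrative, not real data).
transactions = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
    {"beer", "diapers", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of baskets containing the antecedent that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence relative to the consequent's baseline support.

    Values above 1 suggest the items co-occur more often than
    independence would predict; values below 1 suggest the opposite.
    """
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


if __name__ == "__main__":
    rule = ({"diapers"}, {"beer"})
    print(f"support    = {support(rule[0] | rule[1], transactions):.2f}")
    print(f"confidence = {confidence(*rule, transactions):.2f}")
    print(f"lift       = {lift(*rule, transactions):.2f}")
```

In this toy data the lift comes out below 1, a reminder that two items appearing together often is not by itself evidence of an association when each item is already common on its own.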
Data mining tools and techniques
Data mining techniques are used in many research areas, including mathematics, cybernetics, genetics, and marketing. They are a means to drive efficiencies and predict customer behavior; used correctly, they let a business set itself apart from its competition through predictive analysis.
Web mining, a type of data mining used in customer relationship management, integrates information gathered by traditional data mining methods and techniques over the web. Web mining aims to understand customer behavior and to evaluate how effective a particular website is.
Other data mining techniques and research topics include network approaches based on multitask learning for classifying patterns, parallel and scalable execution of data mining algorithms, the mining of large databases, the handling of relational and complex data types, and machine learning.
Benefits of data mining
In general, the benefits of data mining come from the ability to uncover hidden patterns and relationships in data that can be used to make predictions that impact businesses. Today, data mining is primarily used by companies with a strong consumer focus - retail, financial, communication, and marketing organizations. It enables these companies to determine relationships among internal factors such as price, product positioning, or staff skills, and external factors such as economic indicators, competition, and customer demographics. It also enables them to determine the impact of these factors on sales, customer satisfaction, and corporate profits.
With data mining, a retailer could use point-of-sale records of customer purchases to send targeted promotions based on an individual's purchase history. By mining demographic data from comment or warranty cards, the retailer could develop products and promotions to appeal to specific customer segments.
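As a rough illustration of the point-of-sale scenario, the Python sketch below finds each customer's most frequently purchased category and selects the customers to target with a promotion for a given category. The records, the customer IDs, and the function names (top_category_per_customer, promotion_targets) are hypothetical, and a real retailer would likely use richer signals than a simple purchase count.

```python
from collections import Counter, defaultdict

# Hypothetical point-of-sale records: (customer_id, product_category).
pos_records = [
    ("c1", "baby"), ("c1", "baby"), ("c1", "snacks"),
    ("c2", "pet"), ("c2", "pet"), ("c2", "baby"),
    ("c3", "snacks"),
]


def top_category_per_customer(records):
    """Map each customer to the category they purchased most often."""
    counts = defaultdict(Counter)
    for customer, category in records:
        counts[customer][category] += 1
    # most_common(1) returns a one-element list of (category, count).
    return {
        customer: counter.most_common(1)[0][0]
        for customer, counter in counts.items()
    }


def promotion_targets(records, category):
    """Customers whose most-purchased category matches the promotion."""
    top = top_category_per_customer(records)
    return sorted(c for c, favorite in top.items() if favorite == category)


if __name__ == "__main__":
    print(top_category_per_customer(pos_records))
    print(promotion_targets(pos_records, "baby"))
```

Using the single most-purchased category keeps the sketch short; a customer such as c2, who bought a baby product once, would be missed by a baby-category promotion under this rule, so the threshold is a design choice rather than a given.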