Tuesday, January 21, 2020

Data Mining and Data Warehousing

Data mining and data warehousing both support business intelligence and enable decision making, but they operate on an enterprise’s data in different ways. A data warehouse is an environment where the data of an enterprise is gathered and stored in an aggregated and summarized manner.

On the other hand, data mining is a process that applies algorithms to extract knowledge from the data, including knowledge that you may not even know exists in the database.
Let us check out the difference between data mining and data warehousing with the help of a comparison chart shown below.

Comparison Chart

Basis for comparison | Data Mining | Data Warehousing
Basic | Data mining is a process to retrieve or extract meaningful data from database/data warehouse. | Data warehouse is a repository where the information from multiple sources is stored under a single schema.

Definition of Data Mining

Data mining is a process for discovering knowledge that you never expected to exist in your database. With a traditional query tool, you can retrieve only the information you already know to look for. Data mining, by contrast, gives you a way to retrieve hidden information from the data. It extracts meaningful information from the database that can be used for decision making.
Knowledge discovery in databases, referred to as KDD, reveals relationships and patterns. A relationship may hold between two or more different objects, or between attributes of the same object. A pattern is another outcome of data mining: a regular and intelligible sequence of information that helps in decision making.
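To make the contrast between a traditional query and data mining concrete, here is a minimal sketch in Python. The transactions, the function names and the support threshold are all made up for illustration. The query answers a question you already chose to ask, while the mining step scans every item pair and reports the relationships that occur often.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set holds the items bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "eggs"},
    {"bread", "butter", "milk"},
]


def known_query(item):
    """Traditional query: count transactions containing an item we already asked about."""
    return sum(1 for t in transactions if item in t)


def frequent_pairs(min_support):
    """Mining step: find every item pair whose share of transactions is at least min_support."""
    counts = Counter()
    for t in transactions:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


print(known_query("milk"))
print(frequent_pairs(0.5))
```

Counting pairs is only one simple way to surface a relationship between objects. Real data mining tools use more elaborate algorithms, but the idea of finding something you did not explicitly ask for is the same.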
The steps involved in KDD can be summarized as follows. First comes selection of the data set on which data mining is to be performed. Next is pre-processing, which involves removal of inconsistent data. Then comes data transformation, where the data is transformed into a form appropriate for data mining.
Next is data mining itself, where the data mining algorithms are applied to the data. Finally comes interpretation and evaluation, which involves extracting the relations or patterns among the data.
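The five KDD steps above can be sketched as a small Python pipeline. This is only an illustration under stated assumptions: the records, the age bands, the 100.0 threshold and every function name are hypothetical, and the "mining" step is a deliberately simple average per group rather than a real mining algorithm.

```python
from statistics import mean

# Hypothetical raw records; the None age and negative spend stand in for inconsistent data.
raw_records = [
    {"customer": "a", "age": 23, "region": "north", "spend": 120.0},
    {"customer": "b", "age": 37, "region": "north", "spend": 340.0},
    {"customer": "c", "age": None, "region": "south", "spend": 90.0},
    {"customer": "d", "age": 45, "region": "south", "spend": -5.0},
    {"customer": "e", "age": 29, "region": "south", "spend": 150.0},
    {"customer": "f", "age": 52, "region": "north", "spend": 410.0},
]


def select(records, fields):
    """Selection: keep only the attributes relevant to the analysis."""
    return [{f: r[f] for f in fields} for r in records]


def preprocess(records):
    """Pre-processing: drop records with missing or impossible values."""
    return [r for r in records if r["age"] is not None and r["spend"] >= 0]


def transform(records):
    """Transformation: replace the raw age with a coarse age band."""
    return [
        {"age_band": "under_35" if r["age"] < 35 else "35_plus", "spend": r["spend"]}
        for r in records
    ]


def mine(records):
    """Data mining: average spend per age band (a simple stand-in algorithm)."""
    groups = {}
    for r in records:
        groups.setdefault(r["age_band"], []).append(r["spend"])
    return {band: mean(values) for band, values in groups.items()}


def interpret(patterns, min_gap):
    """Interpretation and evaluation: keep the pattern only if the gap is large enough."""
    high = max(patterns, key=patterns.get)
    low = min(patterns, key=patterns.get)
    gap = patterns[high] - patterns[low]
    return (high, low, gap) if gap >= min_gap else None


def run_kdd(records, min_gap=100.0):
    """Chain the five KDD steps in order."""
    selected = select(records, ["age", "spend"])
    cleaned = preprocess(selected)
    transformed = transform(cleaned)
    patterns = mine(transformed)
    return interpret(patterns, min_gap)


print(run_kdd(raw_records))
```

Keeping each step as its own function mirrors the way the KDD process is described: the output of one stage is the input of the next, so any stage can be swapped for a more realistic one without touching the others.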
Data mining fits well in a data warehouse environment, which stores data in an aggregated and summarized manner, because data in that form is easier to mine.

Definition of Data Warehousing

A data warehouse is a central location where information gathered from multiple sources is stored under a single unified schema. The data is first gathered from the different sources of the enterprise, then cleaned, transformed and stored in the data warehouse. Once data is entered in a data warehouse, it stays there for a long time and can be accessed over time.
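As a rough illustration of gathering, cleaning, transforming and storing data under one schema, here is a small Python sketch that uses the standard sqlite3 module as a stand-in for a warehouse. The two source systems, their field names and the customer_summary table are all hypothetical. A real data warehouse would involve far more than a single in-memory table.

```python
import sqlite3

# Hypothetical source systems with different field names and formats.
crm_rows = [
    {"cust_name": " Alice ", "signup": "2019-03-01", "country": "us"},
    {"cust_name": "Bob", "signup": "2019-07-15", "country": "DE"},
]
billing_rows = [
    {"name": "alice", "total_cents": 12500},
    {"name": "Bob", "total_cents": 4300},
    {"name": "bob", "total_cents": 700},
]


def clean_name(value):
    """Cleaning: normalize names so both sources agree on the key."""
    return value.strip().lower()


def build_warehouse(crm, billing):
    """Load both sources into one unified, summarized table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_summary ("
        "name TEXT PRIMARY KEY, country TEXT, signup_date TEXT, total_spend REAL)"
    )
    # Transformation: aggregate billing amounts per cleaned customer name.
    totals = {}
    for row in billing:
        key = clean_name(row["name"])
        totals[key] = totals.get(key, 0) + row["total_cents"]
    # Storage: one row per customer under the single schema.
    for row in crm:
        key = clean_name(row["cust_name"])
        conn.execute(
            "INSERT INTO customer_summary VALUES (?, ?, ?, ?)",
            (key, row["country"].upper(), row["signup"], totals.get(key, 0) / 100),
        )
    conn.commit()
    return conn


warehouse = build_warehouse(crm_rows, billing_rows)
rows = warehouse.execute(
    "SELECT name, country, total_spend FROM customer_summary ORDER BY name"
).fetchall()
print(rows)
```

The point of the sketch is that the inconsistencies between sources (stray spaces, mixed case, amounts in cents) are resolved once, at load time, so anything that reads the warehouse later sees one consistent view.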
A data warehouse is a blend of technologies such as data modelling, data acquisition, data management, metadata management, development tools and storage management. Together these technologies support functions such as data extraction, data transformation, data storage and providing user interfaces for accessing the data.
A data warehouse is not a product or a piece of software; it is an informational environment that provides an integrated view of the enterprise. It gives you access to the enterprise’s current and historical data, which helps in decision making. It supports transactions made for decision making without affecting operational systems. It is a flexible resource for obtaining strategic information.

Key Differences Between Data Mining and Data Warehousing

  1. The basic difference is that data mining is a process of extracting meaningful data from a large database or data warehouse, whereas a data warehouse provides an environment where the data is stored in an integrated form, which makes it easier for data mining to extract data efficiently.

Conclusion

Data mining can be done only when there is a large, well-integrated database, i.e. a data warehouse. So the data warehouse must be completed before data mining begins. It must hold information in a well-integrated form so that data mining can extract knowledge efficiently.
