Jun 25, 2018 in Research

Data Mining and the Definition of Knowledge


This paper critically examines data mining and the definition of knowledge. It covers the definition of data mining, the process as a whole, the analysis and elements involved, and applications in fields such as business, surveillance, agriculture and science. Finally, it brings to light the challenges facing the field.

According to Kantardzic (2003), data mining has replaced the manual extraction of data, which has been in existence for centuries. By definition, data mining refers to the process and techniques by which we sift through very large amounts of data to come up with useful information. It employs artificial intelligence mechanisms, neural networks and sophisticated statistical analysis tools, for instance cluster analysis, which help to show clearly the patterns, trends, correlations and relationships that would otherwise be difficult or even impossible to detect.

It is worth noting that data mining has been thought to be applicable in each and every field. It is significant in that it has helped foster effectiveness in businesses and organizations, mainly by cutting costs and increasing revenue for profit-making organizations. Despite its significance and pivotal role in creating operational effectiveness, users need to know when to stop the process before it becomes more trouble than it is worth (Lindorff, 2003).

Processes, analysis and elements of data mining

Data mining involves four main steps. The first is identifying the source of the data and selecting the target data. Pre-processing then follows, which includes cleaning and attribution. The third step is the data mining itself, which entails extraction. The final step is post-processing, which involves pointing out useful, understandable patterns and validating them (Wang, 1999).
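The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from the cited sources: the sample records, the function names and the 50% support threshold are all assumptions made for the example, and the "mining" step simply counts how often pairs of items are bought together.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw data: each record is one customer's basket.
RAW_RECORDS = [
    {"customer": "a", "items": ["bread", "milk", "eggs"]},
    {"customer": "b", "items": ["bread", "milk"]},
    {"customer": "c", "items": None},  # missing data
    {"customer": "d", "items": ["milk", "eggs", "bread"]},
    {"customer": "e", "items": ["tea"]},
]


def select_target(records):
    """Step 1: select the target data (here, only the item lists)."""
    return [record.get("items") for record in records]


def preprocess(baskets):
    """Step 2: clean the data by dropping empty baskets and duplicate items."""
    return [sorted(set(basket)) for basket in baskets if basket]


def mine(baskets):
    """Step 3: extract candidate patterns (counts of item pairs)."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(basket, 2))
    return counts


def postprocess(pair_counts, n_baskets, min_support=0.5):
    """Step 4: keep pairs found in at least min_support of all baskets."""
    return {
        pair: count / n_baskets
        for pair, count in pair_counts.items()
        if count / n_baskets >= min_support
    }


def run_pipeline(records, min_support=0.5):
    """Run the four steps in order and return the surviving patterns."""
    baskets = preprocess(select_target(records))
    return postprocess(mine(baskets), len(baskets), min_support)


patterns = run_pipeline(RAW_RECORDS)
```

In this toy data set, `patterns` maps each frequently co-purchased pair to the share of baskets that contain it. Raising `min_support` makes the post-processing step stricter.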

Models used for data mining include decision trees, rule induction, regression models and neural networks. A decision tree is a set of rules developed and mapped out to look like the branches of a tree, with each branch leading to a value; it can be built with the C4.5 algorithm. Rule induction derives rules from the data, which helps establish where attributes are linked to one another. Regression models are mathematical equations used to establish the relationship or association between and among factors. Finally, neural networks are statistical programs that aid in data classification.
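As a small illustration of the regression model described above, the sketch below fits a straight line to a handful of points using ordinary least squares. The function name and the sample figures (advertising spend against sales) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend (x) against sales (y).
spend = [1, 2, 3, 4]
sales = [3, 5, 7, 9]

slope, intercept = fit_line(spend, sales)
predicted_sales = slope * 5 + intercept  # estimate for a spend of 5
```

The slope expresses how strongly the two factors are associated: in this made-up data, each extra unit of spend goes with two extra units of sales.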

Application and importance of data mining

Data mining is applicable to a myriad of sectors. The notable sectors where it has been used successfully and profitably include science and engineering, business, and surveillance. In business, it has been used for cataloguing, automated mailing, and optimization of resources, which helps in choosing the appropriate channel, offer or customers. It also helps in segmenting customers. In human resources departments, data mining can be used to study the characteristics of top-performing employees and to build from these a profile that serves as a criterion for hiring future workers who will bring success to the company.

Additionally, data mining has been used to detect fraudulent credit card transactions and to identify the purchasing patterns of alpha customers. It also helps businesses translate corporate-level goals, for instance profit and margin share targets, into operational decisions such as production plans and workforce levels. Last but not least, data mining has been used to detect defects in manufacturing. Learning which characteristics lead to lower-quality products helps the relevant authorities fix them, resulting in quality products that appeal to customers and meet their needs and aspirations.
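The fraud detection use mentioned above can be illustrated with a very simple rule: flag a transaction whose amount lies far from a customer's usual spending. The sketch below is an assumption-laden toy example, not how any real card issuer works; the function name, the sample amounts and the threshold of three standard deviations are all hypothetical.

```python
import statistics


def is_suspicious(history, amount, threshold=3.0):
    """Flag an amount that deviates strongly from past transactions.

    history   -- past transaction amounts for one customer (at least two)
    amount    -- the new transaction amount to check
    threshold -- how many standard deviations count as unusual (assumed)
    """
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    if spread == 0:
        # All past amounts are identical, so any different amount stands out.
        return amount != mean
    return abs(amount - mean) / spread > threshold


# Hypothetical spending history for one cardholder.
past_amounts = [20, 25, 22, 30, 24, 21]

large_purchase_flagged = is_suspicious(past_amounts, 500)
usual_purchase_flagged = is_suspicious(past_amounts, 26)
```

Here the purchase of 500 is flagged while the purchase of 26 is not. Production systems would combine many more signals than the amount alone.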

Generally speaking, data mining helps businesses acquire the latest information about the current market, which, when acted upon rationally and at the right time, gives the organization a competitive advantage (Wang, 1999). As previously mentioned, this information includes competition analysis, economic patterns, consumer behavior, geographic analysis, security measures to curb fraud, and market research, among others.

Privacy concern, ethics and challenges

According to Noyes (2004), although data mining has been documented to bring a myriad of advantages when appropriately adopted, there are instances where it has led to disastrous consequences instead of the operational effectiveness it was expected to deliver. It is therefore critical to know when and how to stop the process before it turns into a nightmare for the organization. Many scholars have pointed out that stopping the process is key, especially when privacy and ethical issues arise.

It has been established that data mining can turn out to be misleading, contributing to very expensive and dangerous decisions that can be lethal to a business. According to Noyes (2004), this is because organizations that become adept at data extraction and analysis often become very slow to take action.

One example of data mining working well is Parkway Corp. in Philadelphia. Although others were not lucky from the onset, Parkway did reap the benefits of data mining: its revenue increased by 13% in 1998 compared with 1997. This is because the technology helped the company cut costs and increase revenue (Lindorff, 2003).

Bearing in mind that data mining can at times turn into a nightmare both inside and outside an organization, the appropriate time to stop using it is when, for instance, costs outrun benefits, the data generated cannot be used in making decisions, information leaks out to unauthorized individuals, or there is no possibility of integrating the various data.

Data mining can also be stopped when the software is not known to be effective, as this can lead to misuse of funds. Since it is an organizational change, it ought to be stopped if there are reasons why it should not be adopted. Data mining involves very tedious and cumbersome work as well as a huge budgetary allocation. When these are coupled with other factors, such as doubts about the efficiency of the software, it is only sound not to adopt it, as it could trigger the downfall of an organization (Klingler, 2002).


From the review, data mining is a process or technique that has replaced the manual extraction of data, which has been in existence for centuries. It entails sifting through a large pool of data to come up with useful information. It has been established that data mining has the potential to be applied in various sectors, for instance science and engineering, business, surveillance and agriculture, to mention but a few.

The important uses of the process include fraud detection, defect detection, market segmentation and market analysis. All of these help organizations and businesses gain an edge in a competitive business world, as they are in a position to acquire information about the existing market.

Although it is an important process, care should be exercised to ensure that the desired benefits are not outweighed by its negative implications. Organizations need to be aware of the ethical and privacy issues that may turn out to be disastrous. With this knowledge, businesses can know where and how to stop the process before things turn bad. These risks can be countered by passing legislation that would help keep data mining in check by clearly indicating the purpose of collecting the data, how it will be used and by whom, and by addressing security issues as well as how the data will be updated. More importantly, there is a need to understand and strike a balance between the importance of people and goal-oriented business processes.
