Data mining is the process of extracting information from large amounts of data and searching it for patterns of behavior that are not apparent at a glance. Several tools are designed to extract this kind of knowledge from databases that hold large volumes of information. Among the best known are SPSS Clementine, Oracle Data Miner and Weka. Weka is the most accessible of the three, since it is written in Java and released as free software under the GPL. It can load the data to analyze from a database, from a .csv file or from .arff files (Weka's own format). Suppose we have a data set whose rows fall into groups, or clusters.
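To give an idea of what Weka's .arff format looks like, here is a minimal sketch in Python that turns a small CSV table into ARFF text. The function name `csv_to_arff`, the sample columns (`height`, `weight`, `group`) and the relation name `people` are all made up for illustration. The sketch assumes simple values with no spaces, quotes or missing entries, and Weka can also open .csv files directly, so this only illustrates the format.

```python
import csv
import io


def csv_to_arff(csv_text, relation, nominal=()):
    """Convert CSV text (with a header row) into ARFF text.

    Columns listed in `nominal` are declared as nominal attributes with
    the set of values found in the data; every other column is assumed
    to be numeric. (Hypothetical helper, not part of Weka.)
    """
    rows = list(csv.reader(io.StringIO(csv_text.strip())))
    header, data = rows[0], rows[1:]

    # Header section: relation name, then one @attribute line per column.
    lines = [f"@relation {relation}", ""]
    for i, name in enumerate(header):
        if name in nominal:
            values = sorted({row[i] for row in data})
            lines.append(f"@attribute {name} {{{','.join(values)}}}")
        else:
            lines.append(f"@attribute {name} numeric")

    # Data section: one comma-separated line per row.
    lines += ["", "@data"]
    lines += [",".join(row) for row in data]
    return "\n".join(lines) + "\n"


# Made-up sample data: two numeric columns and one group label.
SAMPLE_CSV = """
height,weight,group
1.62,58,a
1.80,82,b
1.75,77,b
"""

if __name__ == "__main__":
    print(csv_to_arff(SAMPLE_CSV, "people", nominal={"group"}))
```

The output starts with a `@relation` line, declares each column with `@attribute`, and lists the rows after `@data`. The `group` column is declared as a nominal attribute, which is the kind of column Weka can use as the class to predict.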
Once the data is loaded into Weka, we can use the tool to identify the attributes that are most relevant for classifying the rows into those groups. We can also use those attributes to build decision trees or classification rules that help us understand why each row falls into a particular group. And when the groups are not known in advance, Weka can split the data into groups (clusters) itself, for example with the k-means clustering algorithm. In short, tools of this kind will delight statistics fans eager to squeeze more information out of their data. This is only a small part of what Weka can do, and anyone with access to the commercial tools mentioned above will be amazed at how much more is possible. Best regards. Locualo webmaster.