
By Clarisse Dhaenens, Laetitia Jourdan
Big Data is a new field, with many technological challenges to be understood in order to use it to its full potential. These challenges arise at all stages of working with Big Data, beginning with data generation and acquisition. The storage and management phase presents serious challenges: infrastructure, for storage and transportation, and conceptual models. Finally, extracting meaning from Big Data requires complex analysis. Here the authors propose using metaheuristics as a solution to these challenges: they are, first, able to deal with large-size problems and, second, flexible and therefore easily adaptable to different types of data and different contexts.
The use of metaheuristics to overcome some of these data mining challenges is introduced and justified in the first part of the book, alongside a specific protocol for the performance evaluation of algorithms. An introduction to metaheuristics follows. The second part of the book details a number of data mining tasks, including clustering, association rules, supervised classification and feature selection, before explaining how metaheuristics can be used to deal with them. This book is designed to be self-contained, so that readers can understand all of the concepts discussed within it, and to provide an overview of recent applications of metaheuristics to knowledge discovery problems in the context of Big Data.
Similar computer science books
Designed to provide breadth-first coverage of the field of computer science.
Each edition of Introduction to Data Compression has widely been considered the best introduction and reference text on the art and science of data compression, and the fourth edition continues in this tradition. Data compression techniques and technology are ever-evolving, with new applications in image, speech, text, audio, and video.
Computers as Components: Principles of Embedded Computing System Design, 3e, presents essential knowledge on embedded systems technology and techniques. Updated for today's embedded systems design methods, this edition features new examples including digital signal processing, multimedia, and cyber-physical systems.
Computation and Storage in the Cloud: Understanding the Trade-Offs
Computation and Storage in the Cloud is the first comprehensive and systematic work investigating the issue of computation and storage trade-off in the cloud in order to reduce the overall application cost. Scientific applications are usually computation and data intensive, where complex computation tasks take a long time to execute and the generated datasets are often terabytes or petabytes in size.
Extra resources for Metaheuristics for Big Data
Example text
On the other hand, in unsupervised learning, feature selection aims to find a good subset of features that forms high-quality clusters for a given number of clusters. In supervised learning, three approaches exist according to the interaction with the classification procedure:
– filter approaches evaluate features according to their intrinsic characteristics to decide whether or not to select them;
– wrapper approaches evaluate the quality of a subset of features using, for example, a learning algorithm;
– embedded approaches combine the two aforementioned approaches by incorporating in a wrapper approach a deeper interaction between attribute selection and classifier construction.
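To make the difference between filter and wrapper approaches concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than the authors' method: the toy dataset `X`, `y`, the filter criterion (absolute difference between class means, which is scale-dependent) and the wrapper criterion (leave-one-out accuracy of a 1-nearest-neighbour classifier). The embedded approach is not shown.

```python
import itertools

# Hypothetical toy dataset: 8 samples, 3 features, binary labels.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 1.0, 0.1],
    [0.9, 3.0, 0.3],
    [1.1, 4.0, 0.2],
    [3.0, 4.5, 0.8],
    [3.2, 1.5, 0.9],
    [2.9, 3.5, 0.7],
    [3.1, 2.0, 0.9],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]


def filter_score(j):
    """Filter criterion (assumed): absolute difference between the class means of feature j."""
    class0 = [row[j] for row, label in zip(X, y) if label == 0]
    class1 = [row[j] for row, label in zip(X, y) if label == 1]
    return abs(sum(class0) / len(class0) - sum(class1) / len(class1))


def loo_accuracy(subset):
    """Wrapper criterion (assumed): leave-one-out accuracy of 1-NN using only `subset`."""
    correct = 0
    for i, row in enumerate(X):
        def dist(k):
            # Squared Euclidean distance restricted to the selected features.
            return sum((row[j] - X[k][j]) ** 2 for j in subset)
        nearest = min((k for k in range(len(X)) if k != i), key=dist)
        correct += y[nearest] == y[i]
    return correct / len(X)


n_features = len(X[0])

# Filter: rank features individually, without running any classifier.
filter_ranking = sorted(range(n_features), key=filter_score, reverse=True)

# Wrapper: score every non-empty subset with the learning algorithm itself.
subsets = [
    s
    for r in range(1, n_features + 1)
    for s in itertools.combinations(range(n_features), r)
]
best_subset = max(subsets, key=loo_accuracy)
best_accuracy = loo_accuracy(best_subset)

print("filter ranking:", filter_ranking)
print("wrapper best subset:", best_subset, "accuracy:", best_accuracy)
```

The filter scores each feature once, whereas the wrapper calls the classifier for every candidate subset. Enumerating all subsets is only feasible here because there are three features.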
If the size of the problem is small, an exact algorithm (for example, branch-and-bound or dynamic programming) can find the optimal solution. Unfortunately, these algorithms are based on enumerative procedures and may not be usable on large problems (although, in fact, size is not the only limiting criterion). In the latter case, it is recommended to use heuristic methods to find good solutions in a reasonable time, even if optimality is not guaranteed. These methods include both specific heuristics developed for a dedicated problem and metaheuristics, which offer generic resolution schemes that can potentially be adapted to any type of optimization problem.
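The trade-off between exact enumeration and heuristic search can be sketched on a toy problem. The instance below (a small knapsack-style problem with made-up `VALUES`, `WEIGHTS` and `CAPACITY`) and the simple bit-flip hill climber are assumptions chosen for illustration. Plain exhaustive search stands in here for the exact methods named in the text, and the hill climber is one simple local search, not a full metaheuristic.

```python
import itertools
import random

# Hypothetical toy instance: item values, item weights and a capacity.
VALUES = [10, 7, 12, 4, 9, 6, 3, 8]
WEIGHTS = [5, 4, 7, 2, 6, 3, 1, 5]
CAPACITY = 15


def fitness(solution):
    """Total value of the selected items, or 0 if the capacity is exceeded."""
    weight = sum(w for w, bit in zip(WEIGHTS, solution) if bit)
    if weight > CAPACITY:
        return 0
    return sum(v for v, bit in zip(VALUES, solution) if bit)


def exhaustive_search():
    """Exact method: enumerate all 2**n binary vectors and keep the best one."""
    return max(itertools.product((0, 1), repeat=len(VALUES)), key=fitness)


def hill_climbing(rng, max_iters=1000):
    """Heuristic: flip one random bit at a time, keeping only strict improvements."""
    current = [0] * len(VALUES)
    for _ in range(max_iters):
        i = rng.randrange(len(current))
        neighbor = current[:]
        neighbor[i] = 1 - neighbor[i]
        if fitness(neighbor) > fitness(current):
            current = neighbor
    return current


exact = exhaustive_search()
approx = hill_climbing(random.Random(1))
exact_value = fitness(exact)
approx_value = fitness(approx)

print("exact:", exact_value, "heuristic:", approx_value)
```

With 8 items the exact search evaluates 256 solutions, and each added item doubles that count. The hill climber's cost is fixed by `max_iters`, but it may stop at a local optimum whose value is below `exact_value`.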
Regarding data mining problems, many encodings have been proposed. Among the most famous ones, we may cite:
– Binary encoding: the solution is represented by a vector of n binary values, representing decision variables of the problem. The search space is of size 2^n.
– Vector of discrete values: variables are not limited to binary values, but they may take discrete values.
– Vector of real values: variables can have real values.
– Permutation: the solution is described by a permutation of size n.
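The four encodings listed above can be sketched as plain Python lists. The function names, the number of discrete values `k` and the real-valued bounds are hypothetical choices made for this illustration.

```python
import random


def random_binary(n, rng):
    """Binary encoding: n decision variables in {0, 1}; 2**n possible solutions."""
    return [rng.randint(0, 1) for _ in range(n)]


def random_discrete(n, k, rng):
    """Vector of discrete values: each variable takes a value in {0, ..., k-1}."""
    return [rng.randrange(k) for _ in range(n)]


def random_real(n, low, high, rng):
    """Vector of real values: each variable is a float in [low, high]."""
    return [rng.uniform(low, high) for _ in range(n)]


def random_permutation(n, rng):
    """Permutation encoding: an ordering of the integers 0, ..., n-1."""
    perm = list(range(n))
    rng.shuffle(perm)
    return perm


rng = random.Random(0)
n = 6
binary = random_binary(n, rng)
discrete = random_discrete(n, 4, rng)
real = random_real(n, -1.0, 1.0, rng)
permutation = random_permutation(n, rng)
search_space_size = 2 ** n  # size of the binary search space for n = 6

print(binary, discrete, real, permutation, search_space_size)
```

In feature selection, for instance, a binary vector is a natural fit: bit i could indicate whether feature i is kept.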