Saturday, April 25, 2009

Assignment 8



PCI Chapter 5, Optimization



We as a group divided the chapter down and each created slides based on our assigned sections. Once completed we assembled the slides, and there were some interesting things that we learned.



One of the most important pieces to an optimization problem is the cost function. The cost function is also the most difficult things to determine. An optimization problem tries to minimize the cost function. There are a few methods of optimization. The first is Random Searching. Random Searching is just what the name implies. It is just random guessing. It is only really good for using as a baseline against other algorithms.




Hill Climbing is a variation of Random Searching. It finds its closest neighbors and finds the best value. It is only able to find the local minimum rather than the global one. Simulated Annealing uses random searching to find an initial value. The algorithm looks for progressively better values. It is also much more efficient when it comes to finding the global minimum. Genetic Algorithms start with a set of random solutions. It takes the best solutions and the rest are considered modifications of the best ones. It continues over an over until no improvement is shown. Time must also be spent deciding on how to represent the solution. Depending on the type of problem, the solution may vary on how it is represented.



One last thing to keep in mind is displaying the data. As with all data the visualization has to be a healthy mix of human understandable format, and enough detail to still be relevant. The algorithms described before do a good job of finding the solution, but displaying it can be a whole other issue. It has to be human readable and still retain its relevance.



Below are 2 example of the same data:



BAD






GOOD





These 2 pictures show the exact same data, but the second is much more human readable. This because the lines are not crossed. Keeping lines from crossing is the most common way of making data easy to read. This is done by counting lines, keeping track of lengths and positions and using an algorithm to make sure that none of these lines are crossing using this data. Ironically enough, this is very often done using a genetic algorithm. Also, to keep the data from being oddly distanced, min and max line lengths are often declared. This keeps thedata easy to read and it also keeps it auto genereated, which is important when dealing with very large data sets.

No comments:

Post a Comment