Evolving Robust And Specialized Car Racing Skills


Julian Togelius
Department of Computer Science
University of Essex, UK
julian@togelius.com

Simon M. Lucas
Department of Computer Science
University of Essex, UK
sml@essex.ac.uk

Abstract: Neural network-based controllers are evolved for racing simulated R/C cars around several tracks of varying difficulty. The transferability of driving skills acquired when evolving for a single track is evaluated, and different ways of evolving controllers able to perform well on many different tracks are investigated. It is further shown that such generally proficient controllers can reliably be developed into specialized controllers for individual tracks. Evolution of sensor parameters together with network weights is shown to lead to higher final fitness, but only if turned on after a general controller is developed; otherwise it hinders evolution. It is argued that simulated car racing is a scalable and relevant testbed for evolutionary robotics research, and that the results of this research can be useful for commercial computer games.

Keywords: Evolutionary robotics, games, car racing, driving, incremental evolution

I. INTRODUCTION

Car racing is a remarkably popular preoccupation - both to watch and to participate in - be it in a computer simulation or in the “real world”. But it is not only popular, it is also challenging: racing well requires fast and accurate reactions, knowledge of the car’s behaviour in different environments, and various forms of real-time planning, such as path planning and deciding when to overtake a competitor. In other words, it requires many of the core components of intelligence being researched within computational intelligence and robotics. The success of the recent DARPA Grand Challenge[1], where completely autonomous real cars raced in a demanding desert environment, may be taken as a measure of the interest in car racing within these research communities.

This paper deals with using evolutionary algorithms to create neural network controllers for simulated car racing. Specifically, we evolve controllers that have robust performance over different tracks, and can be specialized to work better on particular tracks.

Evolutionary robotics (the use of evolutionary algorithms for embodied control problems) and simulated car racing are in many ways ideal companions. The benefit for the development of racing games and simulations is clear: evolutionary robotics offers a way to automatically develop controllers, possibly specialized for specific tracks or types of tracks, driving styles, skill levels, competitors etc. One could envision a racing simulator where the user is allowed to construct his own tracks and cars, and the game automatically develops a set of controllers to drive these tracks. The game could also automatically adapt to the user’s driving style, or learn from other drivers (humans or machines) on the Internet.

The benefits for evolutionary robotics might require some explanation. While evolutionary robotics has successfully been used for various interdisciplinary investigations (e.g. of memory mechanisms, neural architectures and evolutionary dynamics), and for parameter tuning of some more complex controllers, its approximately 15 years of development have not seen much scaling up[2].
That is, we have yet to see the evolution of robot controllers (as opposed to just parameters of such) for any really complex problem - problems where artificial evolution becomes a superior alternative to manual design of controllers.

We believe that some of the reason for this lack of progress is the limited environments, sensor data, embodiments, and tasks in most evolutionary robotics experiments. A typical such experiment uses a semi-holonomic robot operating in an impoverished environment (in many ways resembling a “Skinner box”, the simplistic boxes pioneered by B. F. Skinner for studying operant conditioning[3]), using simple, low-bandwidth sensor input, doing a task that is hard to incrementally scale up. The car racing task uses a more complex and interesting robot morphology, as a car is more complex to control than a semi-holonomic robot, but at the same time it has more capabilities. While a simple racing track might be as impoverished an environment as any Skinner box, it can be scaled up. A controller might be evolved to race a simple track, which can then be progressively complexified (by adding competitors, gears, crossroads, blind alleys, bridges, jumps etc.) up to and above the level of the DARPA Grand Challenge, without ever changing the nature of the fitness function, thus ensuring smooth scaling up.

This solution to the problems of environment and task scalability does come at a cost: the car will probably need ever more sophisticated sensors, including high-bandwidth visual input, to navigate more complex tracks. But such input can be supplied if we use one of today’s graphically sophisticated racing games as the experimental environment. This shifts the problem to one of controller encodings that can handle such complex input.

A. Prior research

1) Evolutionary car racing: A few investigations into evolutionary car racing can be found in the recent literature.
Togelius and Lucas[4] investigated various controller architectures and sensor input representations for simulated car racing. It was concluded that the only combination out of those studied that allows evolution to reliably produce good racing controllers uses neural networks informed by egocentric information from range-finder sensors. Best performance was achieved by making ranges and angles of the range-finders evolvable, and providing the network with a further sensor indicating the angle to the next waypoint. The controllers were only tried on one track, but some noise was introduced into the environment and the track was surrounded by impenetrable walls. The best of the evolved controllers outperformed all of a small sample of human competitors.

Stanley et al.[5] used a similar setup - neural networks informed by range-finders - in an experiment aimed at evolving both controllers and crash-warning systems for subsequently impaired controllers. The experiment was conducted on a single track in simulation (using the RARS simulator[6]), and the track was not surrounded by walls, so the car was allowed (at a fitness penalty) to venture outside the track.

In another interesting experiment, Floreano et al. evolved neural networks for simulated car racing using first-person visual input from the driving simulator Carworld[7][8]. However, only 5 x 5 pixels of the visual field were used as inputs for the network; the position of these pixels was dynamically selected by the network, in a process known as active vision.

A different approach to evolutionary car racing was taken by Tanev et al., who evolved parameters for a hand-coded racing car controller, using anticipatory modeling of the car’s position[9]. While the amount of human input into the controller design process is arguably higher in this case, this approach allowed evolution of controllers for real radio-controlled cars without an intermediary simulation step.

Also related is the work of Wloch and Bentley, who used a human-designed controller built into a high quality racing simulator, but used artificial evolution to optimize all physical and mechanical parameters of the car[10].
Evolution here managed to come up with car configurations that performed better than any of the stock cars in the simulator.

2) Supervised learning and real-world applications: Machine learning techniques have also been used in real-world car driving and racing applications, though these techniques have been forms of supervised learning rather than evolutionary learning. Perhaps the best known of these is Pomerleau’s use of backpropagation to train neural networks to associate pre-processed video input with a human driver’s actions, leading to a controller able to keep a real car on the road and take curves appropriately[11]. More recently, the winning team in the DARPA Grand Challenge made extensive use of machine learning in developing their car controller.

Going from physical reality to virtual reality, Microsoft’s Xbox video game Forza Motorsport is worthy of mention, as all the opponent car controllers have been trained by supervised learning of human player data, instead of the usual racing game technique of blindly following precalculated racing lines[12]. The player can even train his own “drivatars” to race tracks in his place, after they have acquired his or her individual driving style.

Supervised learning, however, ultimately suffers from requiring good training data. Sometimes such training data is simply not available, at other times it is prohibitively expensive to obtain, and at yet other times imitating human drivers is simply not what we want.

B. Motivations for this paper

While the research referred to above has shown the usefulness of evolutionary robotics techniques for car racing, the controllers have in all those cases only been tested on a single track, and sometimes with severe simplifying assumptions, such as being able to drive through walls.
Thus, the first objective of the research reported in this paper is to evolve neural network controllers each capable of competitively and reliably navigating a variety of different tracks, including tracks they have not been trained on. Based on the range-finding and aimpoint sensors proposed in[4], we investigate which sensor setup and evolutionary sequence allows us to create such controllers.

A second objective is to investigate whether evolution of a specialized controller, i.e. one performing very well on a particular track, can be sped up by starting from an already evolved “general” controller. Such a process could be useful for example in a racing game, where users are allowed to design tracks and a controller providing good performance on such tracks needs to be created on the fly.

The concrete questions we pose and try to answer are the following: How robust is the evolutionary algorithm, that is, how certain can we be that a given evolutionary run will produce a proficient controller for a given track? Does the layout of the racing track directly influence the fitness landscape, so that some tracks are much harder than others to evolve for, while not being impossible to drive? What is the transferability of knowledge gained in evolving for one track in terms of performance on other tracks? Can we evolve controllers that can proficiently race all tracks in our training set? How? Can such generally proficient controllers be used to reliably create specialized controllers that perform well, but only on particular tracks? Finally, can this be done even for tracks for which it is not possible to evolve a good controller from scratch?

While this investigation primarily addresses the scalability of the problem domain (and to some extent of the sensor/network combination), it may also be of use for practical applications such as racing games to find out the most reliable ways to evolve proficient controllers.

C. Overview of the paper

The paper is laid out as follows: first, we describe the characteristics of the car racing simulation we will be using, including sensor models, tracks, and how this model differs from the problem of racing real radio-controlled cars. The next section details the neural networks and evolutionary algorithm we employ. We then proceed to describe experiments on evolving controllers optimized for the individual tracks from scratch, followed by a section where we investigate how to evolve controllers that provide robust performance over several tracks. These controllers are then validated on tracks for which they have not been evolved. Finally, these controllers are further evolved to provide better fitness on specific tracks, conclusions are drawn, and further research is suggested.

II. THE CAR RACING MODEL

The experiments in this article were performed in a 2-dimensional simulator, intended to model, qualitatively if not quantitatively, a standard radio-controlled (R/C) toy car (approximately 17 centimeters long) in an arena with dimensions approximately 3*2 meters, where the track is delimited by solid walls. The simulation has the dimensions 400*300 pixels, and the car measures 20*10 pixels.

R/C toy car racing differs from racing full-sized cars in several ways. One is the simplified controls; many R/C cars have only three possible drive modes (forward, backward, and neutral) and three possible steering modes (left, right and center). Other differences are that many toy cars have bad grip on many surfaces, leading to easy skidding, and that damaging such cars in collisions is harder due to their low weight.

The dynamics of the car are based on a reasonably detailed mechanical model, taking into account the small size of the car and bad grip on the surface, but are not based on any actual measurements[13][14]. The model is similar to that used in[4], and differs mainly in its improved collision handling; after more experience with the physical R/C cars the collision response system was reimplemented to make collisions more realistic (and, as an effect, more undesirable). Now, a collision may cause the car to get stuck if the wall is struck at an unfortunate angle, something often seen in experiments with physical cars.

A track consists of a set of walls, a chain of waypoints, and a set of starting positions and directions. A car is added to a track in one of the starting positions, with the corresponding starting direction, both the position and angle being subject to random alterations. The waypoints are used for fitness calculations.

For the experiments we have designed eight different tracks, presented in figure 1. The tracks are designed to vary in difficulty, from easy to hard. Three of the tracks are versions of three other tracks with all the waypoints in reverse order, and the directions of the starting positions reversed.

Fig. 1. The eight tracks. Notice how tracks 1 and 2 (at the top), 3 and 4, 5 and 6 differ in the clockwise/anti-clockwise layout of waypoints and associated starting points. Tracks 7 and 8 have no relation to each other apart from both being difficult.

The main differences between our simulation and the real R/C car racing problem have to do with sensing. As reported in Tanev et al.[9] as well as[4], there is a small but not unimportant lag in the communication between camera, computer and car, leading to the controller acting on outdated perceptions. Apart from that, there is often some error in estimations of the car’s position and velocity from an overhead camera. In contrast, the simulation allows instant and accurate information to be fed to the controller.

III. EVOLVABLE INTELLIGENCE

A. Sensors

The car experiences its environment through two types of sensors: the waypoint sensor, and the wall sensors. The waypoint sensor gives the difference between the car’s current orientation and the angle to the next waypoint (but not the distance to the waypoint). When pointing straight at a waypoint, this sensor thus outputs 0; when the waypoint is to the left of the car it outputs a positive value, and vice versa. As for the wall sensors, each sensor has an angle (relative to the orientation of the car) and a range, between 0 and 200 pixels. The output of the wall sensor is zero if no wall is encountered along a line with the specified angle and range from the centre of the car; otherwise it is a fraction of one, depending on how close to the car the sensed wall is.
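The two sensor types described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the representation of walls as line segments, the mathematical axis convention (y pointing up, so that "left" corresponds to a positive angle), and the linear falloff 1 - distance/range for the wall reading are all assumptions made for illustration. The text only states that the wall reading is a fraction of one that depends on how close the wall is.

```python
import math

def waypoint_sensor(car_pos, car_heading, waypoint):
    """Signed angle (radians) from the car's heading to the next waypoint.

    Returns 0 when the car points straight at the waypoint and a positive
    value when the waypoint is to the left (assuming y-up axes).
    """
    bearing = math.atan2(waypoint[1] - car_pos[1], waypoint[0] - car_pos[0])
    # Wrap the difference into [-pi, pi).
    return (bearing - car_heading + math.pi) % (2 * math.pi) - math.pi

def _ray_hit_distance(origin, direction, a, b):
    """Distance along a ray to wall segment a-b, or None if it misses."""
    dx, dy = direction
    ex, ey = b[0] - a[0], b[1] - a[1]
    denom = dx * ey - dy * ex
    if abs(denom) < 1e-12:           # ray parallel to the wall
        return None
    ax, ay = a[0] - origin[0], a[1] - origin[1]
    t = (ax * ey - ay * ex) / denom  # distance along the ray
    u = (ax * dy - ay * dx) / denom  # position along the segment, 0..1
    return t if t >= 0 and 0 <= u <= 1 else None

def wall_sensor(car_pos, car_heading, sensor_angle, sensor_range, walls):
    """0.0 if no wall within range, else closer walls give larger readings.

    The linear mapping 1 - distance/range is an assumption.
    """
    angle = car_heading + sensor_angle
    direction = (math.cos(angle), math.sin(angle))
    nearest = None
    for a, b in walls:
        d = _ray_hit_distance(car_pos, direction, a, b)
        if d is not None and d <= sensor_range and (nearest is None or d < nearest):
            nearest = d
    return 0.0 if nearest is None else 1.0 - nearest / sensor_range
```

Under these assumptions, a wall 50 pixels straight ahead of a sensor with range 200 gives a reading of 0.75, and the same wall is invisible to a sensor with range 40.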
A small amount of noise is applied to all sensor readings, as it is to starting positions and orientations.

In some of the experiments the sensor parameters are mutated by the evolutionary algorithm, but in all experiments they start from the following setup: one sensor points straight forward (0 radians) in the direction of the car and has range 200 pixels, as do three sensors pointing forward-left, forward-right and backward respectively. The two other sensors, which point left and right, have range 100 pixels; this is illustrated in figure 2.

Fig. 2. The initial sensor setup, which is kept throughout the evolutionary run for those runs where sensor parameters are not evolvable. Here, the car is seen in close-up moving upward-leftward. At this particular position, the front-right sensor returns a positive number very close to 0, as it detects a wall near the limit of its range; the front-left sensor returns a number close to 0.5, and the back sensor a slightly larger number. The front, left and right sensors do not detect any walls at all and thus return 0.

B. Neural networks

The controllers in the experiments below are based on neural networks. More precisely, we are using multilayer perceptrons with three neuronal layers (two adaptive layers) and tanh activation functions. A network has at least three inputs: one fixed input with the value 1, one speed input in the approximate range [0..3], and one input from the waypoint sensor, in the range [-π..π]. In addition to this, it might have any number of inputs from wall sensors, in the range [0..1]. All networks have two outputs, which are interpreted as driving commands for the car.

C. Evolutionary algorithm

The genome is an array of floating point numbers, of variable or fixed length depending on the experimental setup. Apart from information on the number of wall sensors and hidden neurons, it encodes the orientation and range of the wall sensors, and the weights of the connections in the neural network.

The evolutionary algorithm used is a kind of evolution strategy, with µ = 50 and λ = 50. In other words, 50 genomes (the elite) are created at the start of evolution. At each generation, one copy is made of each genome in the elite, and all copies are mutated. After that, a fitness value is calculated for each genome, and the 50 best individuals of all 100 form the new elite.

There are two mutation operators: Gaussian mutation of all weight values, and Gaussian mutation of all sensor parameters (angles and lengths), which might be turned on or off. In both cases, the standard deviation of the Gaussian distribution was set to 0.3.

Last but not least: the fitness function. The fitness of a controller is calculated as the number of waypoints it has passed, divided by the number of waypoints in the track, plus an intermediate term representing how far it is on its way to the next waypoint, calculated from the relative distances between the car and the previous and next waypoint. A fitness of 1.0 thus means having completed one full lap of the track within the allotted time. Waypoints can only be passed in the correct order, and a waypoint is counted as passed when the centre of the car is within 30 pixels of the waypoint. In the evolutionary experiments reported below, each car was allowed 700 timesteps (enough to do two to three laps on most tracks in the test set) and fitness was averaged over three trials.

IV. EVOLVING TRACK-SPECIFIC CONTROLLERS

The first experiments consisted of evolving controllers for the eight tracks separately, in order to test the software in general and to rank the difficulty of the tracks.

For each of the tracks, the evolutionary algorithm was run 10 times, each time starting from a population of “clean” controllers, with all connection weights set to zero and sensor parameters as explained above. Only weight mutation was allowed. The evolutionary runs were for 200 generations each.

A. Fixed sensor parameters

1) Evolving from scratch: The results are listed in table I, which is read as follows: each row represents th

TABLE I. The fitness of the best controller of various generations on the different tracks, and number of runs producing proficient controllers. Fitness averaged over 10 separate evolutionary runs; standard deviation between parentheses.

Track   1            2            3            4            5            6            7            8
10      0.32 (0.07)  0.38 (0.24)  0.32 (0.09)  0.53 (0.17)  0.45 (0.08)  0.4 (0.08)   0.3 (0.07)   0.16 (0.02)
50      0.54 (0.2)   0.49 (0.38)  0.97 (0.5)   1.3 (0.48)   0.95 (0.6)   0.68 (0.27)  0.35 (0.05)  0.19 (0.03)
100     0.7 (0.38)   0.56 (0.36)  1.47 (0.63)  1.5 (0.54)   0.95 (0.58)  1.02 (0.74)  0.39 (0.09)  0.2 (0.01)
200     0.81 (0.5)   0.71 (0.5)   1.98 (0.66)  2.33 (0.59)  1.65 (0.45)  1.29 (0.76)  0.46 (0.13)  0.2 (0.01)
Pr.     2            2            7            9            8            5            0            0
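The evolutionary procedure used for these runs (an elite of 50, one mutated copy of each, the best 50 of 100 kept, Gaussian mutation with standard deviation 0.3, all-zero starting weights) can be sketched roughly as follows. This is an illustrative sketch rather than the authors' code: the names `evolve` and `progress_fitness` are hypothetical, the genome is simplified to a flat list of weights, `evaluate` is assumed to be supplied by the simulator (and to average over trials itself), and the scaling of the intermediate fitness term by the number of waypoints is an assumption, since the text does not give its exact form.

```python
import random

MU = 50       # elite size, as stated in the text
SIGMA = 0.3   # standard deviation of the Gaussian mutation, as stated

def progress_fitness(waypoints_passed, waypoints_in_track, fraction_to_next=0.0):
    """Laps completed, with partial credit for progress to the next waypoint.

    fraction_to_next is in [0, 1]; dividing it by the number of waypoints
    is an assumption that keeps the fitness continuous.
    """
    return (waypoints_passed + fraction_to_next) / waypoints_in_track

def evolve(evaluate, genome_length, generations=200, rng=None):
    """Elitist evolution: keep the best MU of MU parents plus MU mutated copies.

    evaluate(genome) -> float is a hypothetical callback returning fitness.
    """
    rng = rng or random.Random()
    # "Clean" controllers: all connection weights start at zero.
    elite = [[0.0] * genome_length for _ in range(MU)]
    for _ in range(generations):
        # One mutated copy of every elite genome; every weight is perturbed.
        offspring = [[w + rng.gauss(0.0, SIGMA) for w in genome]
                     for genome in elite]
        # Fitness is recalculated for all 100 genomes each generation.
        pool = sorted(elite + offspring, key=evaluate, reverse=True)
        elite = pool[:MU]
    return elite  # sorted best-first by the last evaluation
```

Re-evaluating the elite every generation matters when fitness is noisy, as it is here because of the randomized starting positions and sensor noise: a genome that was merely lucky once is less likely to stay in the elite.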

