SLAM: From Frames To Events

SLAM: from Frames to Events
Davide Scaramuzza
Institute of Informatics & Institute of Neuroinformatics
http://rpg.ifi.uzh.ch

Research Topics: Real-time, Onboard Computer Vision and Control for Autonomous, Agile Drone Flight
P. Foehn et al., AlphaPilot: Autonomous Drone Racing, RSS 2020, Best System Paper Award. PDF. Video.

Research Topics: Real-time, Onboard Computer Vision and Control for Autonomous, Agile Drone Flight
Kaufmann et al., Deep Drone Acrobatics, RSS 2020, Best Paper Award finalist. PDF. Video.

Research Topics: Real-time, Onboard Computer Vision and Control for Autonomous, Agile Drone Flight
Loquercio et al., Agile Autonomy: Learning High-Speed Flight in the Wild, Science Robotics, 2021. PDF. Video. Code & Datasets.

SLAM: from Frames to Events

Today's Outline
- A brief history of visual SLAM
- SVO and real-world applications
- Active exposure control
- Event cameras

A Brief History of Visual Odometry & SLAM
- Scaramuzza, D., Fraundorfer, F., Visual Odometry: Part I - The First 30 Years and Fundamentals, IEEE Robotics and Automation Magazine, Vol. 18, Issue 4, 2011. PDF
- Fraundorfer, F., Scaramuzza, D., Visual Odometry: Part II - Matching, Robustness, and Applications, IEEE Robotics and Automation Magazine, Vol. 19, Issue 1, 2012. PDF
- C. Cadena, L. Carlone, H. Carrillo, Y. Latif, D. Scaramuzza, J. Neira, I.D. Reid, J.J. Leonard, Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age, IEEE Transactions on Robotics, Vol. 32, Issue 6, 2016. PDF
- Scaramuzza, Zhang, Visual-Inertial Odometry of Aerial Robots, Encyclopedia of Robotics, Springer, 2019. PDF
- Huang, Visual-Inertial Navigation: A Concise Review, International Conference on Robotics and Automation (ICRA), 2019. PDF
- Gallego, Delbruck, Orchard, Bartolozzi, Taba, Censi, Leutenegger, Davison, Conradt, Daniilidis, Scaramuzza, Event-based Vision: A Survey, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020. PDF

A Brief History of Visual Odometry & SLAM
- 1980: First known VO implementation on a robot, in Hans Moravec's PhD thesis (NASA/JPL) for Mars rovers, using one sliding camera (sliding stereo)
- 1980 to 2000: VO research was dominated by NASA/JPL in preparation for the 2004 mission to Mars
- 2000-2004: First real-time monocular VSLAM solutions (e.g., S. Soatto, A. Davison, D. Nister, G. Klein)
- 2004: VSLAM was used on a robot on another planet: the Mars rovers Spirit and Opportunity (see the seminal paper from NASA/JPL, 2007)
- 2015-today: VSLAM becomes a fundamental tool of several products: vacuum cleaners, scanners, VR/AR, drones, robots, smartphones
- 2021: VSLAM used on the Mars helicopter

Recently Founded VO & SLAM Companies
- AI Incorporated: SLAM for autonomy, software
- Artisense: SLAM for autonomy, software and hardware
- Augmented Pixels: SLAM for mapping, software
- GEO SLAM: SLAM for mapping, software
- Indoors: SLAM for indoor positioning and mapping, software
- Kudan: SLAM for autonomy, software
- Modalai: SLAM hardware for drones
- MYNT EYE: manufacturer of camera-IMU sensors, hardware
- NAVVIS: SLAM for mapping, software and hardware
- Roboception: SLAM for robot arms, software and hardware
- Sevensense: SLAM for autonomy, software and hardware
- SLAMCore: SLAM for autonomy, software and hardware
- SUIND: SLAM for drone autonomy, software and hardware
- VanGogh Imaging: SLAM for object tracking and mapping, software
- Wikitude: SLAM for AR/VR, software

A Short Recap of the Last 40 Years of VIO
[Diagram: progress along three axes.
- Accuracy: feature-based (1980-2000), feature + direct (from 2000), + IMU (from 2007, 10x accuracy)
- Robustness (adverse environment conditions, HDR, motion blur, low texture, dynamic environments): machine learning (from 2012), event cameras (from 2014)
- Efficiency (speed, memory, and CPU load)]

We need more datasets to evaluate the performance of SLAM
[Diagram: datasets grouped by evaluation focus.]
- Robustness (adverse environment conditions, HDR, motion blur, low texture): TartanAir Dataset 2021, BlackBird 2018, UZH-FPV dataset 2018, DSEC 2021, MVSEC 2018, Event Camera 2017
- Efficiency (speed, memory, and CPU load): SLAMBench 3, 2019
- Accuracy: HILTI-SLAM dataset 2021, ETH 3D Dataset 2021, TUM VI Benchmark 2021, Devon Island 2013, TUM-RGBD 2012, KITTI 2012, EuRoC 2016
- Realistic simulators: AirSim 2017, Flightmare, FlightGoggles 2019, ESIM 2018
Algorithms are tuned to overfit datasets! We need a Common Task Framework!

HILTI SLAM Dataset & Challenge
- 2 LiDARs
- 5 standard cameras
- 3 IMUs
- Goal: benchmark the accuracy of structure and motion
Helmberger et al., The Hilti SLAM Challenge Dataset, arXiv preprint, 2021. PDF. Dataset.

HILTI SLAM Challenge - Leaderboard
[Leaderboard: some entries use predominantly LiDAR, others predominantly vision.]
Helmberger et al., The Hilti SLAM Challenge Dataset, arXiv preprint, 2021. PDF. Dataset.

UZH-FPV Drone Racing Dataset & Challenge
- Goal: benchmarking VIO & VSLAM algorithms at high speed, where motion blur and high dynamic range are detrimental
- Recorded with a drone flown by a professional pilot at up to over 70 km/h
- Contains over 30 sequences with images, events, IMU, and ground truth from a robotic total station: https://fpv.ifi.uzh.ch/
- VIO leaderboard: https://fpv.ifi.uzh.ch/?sourcenova-comp-post 2019-2020-uzh-fpv-temporary-leader-board
Delmerico et al., "Are We Ready for Autonomous Drone Racing? The UZH-FPV Drone Racing Dataset", ICRA'19. PDF. Video. Datasets.

UZH-FPV Challenge - Leaderboard
[Leaderboard: top entries use sliding-window optimization a la OKVIS or VINS-Mono, or filter-based approaches (MSCKF). No event cameras have been used yet!]
Delmerico et al., "Are We Ready for Autonomous Drone Racing? The UZH-FPV Drone Racing Dataset", ICRA'19. PDF. Video. Datasets.

Today's Outline
- A brief history of visual SLAM
- SVO and real-world applications (up next)
- Active exposure control
- Event cameras

SVO
- Key needs: low latency, low memory, high speed
- Combines direct and indirect methods (both residuals are sketched below):
  - Direct (minimizes photometric error): used for frame-to-frame motion estimation on corners and edgelets; jointly optimizes poses & structure (sliding window)
  - Indirect (minimizes reprojection error): frame-to-keyframe pose refinement
- Mapping: probabilistic depth estimation (heavy-tailed Gaussian distribution)
- Faster than real-time (up to 400 Hz): 400 fps on i7 laptops and 100 fps on embedded PCs (Odroid (ARM), NVIDIA Jetson)
Source code of SVO Pro: https://github.com/uzh-rpg/rpg_svo_pro_open
Forster, Zhang, Gassner, Werlberger, Scaramuzza, SVO: Semi-Direct Visual Odometry for Monocular and Multi-Camera Systems, IEEE Transactions on Robotics (T-RO), 2017. PDF, code, videos.
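A minimal sketch, not SVO's implementation, of the two residual types the slide refers to: the direct photometric error between a small reference patch and its reprojection in the current image, and the indirect reprojection error of a tracked 3D point. Function names, the patch size, and the border handling are illustrative assumptions.

```python
import numpy as np

def photometric_residual(I_ref, I_cur, p_ref, depth, T_cur_ref, K, patch=2):
    """Direct residual: intensity difference between a patch around a reference
    feature and its reprojection in the current frame. (Sketch only; SVO uses
    corners and edgelets, subpixel interpolation, and a coarse-to-fine scheme;
    image-border checks are omitted here.)"""
    # Back-project the reference pixel to 3D using its estimated depth
    P_ref = depth * np.linalg.inv(K) @ np.array([p_ref[0], p_ref[1], 1.0])
    # Transform into the current frame and project back to pixel coordinates
    P_cur = T_cur_ref[:3, :3] @ P_ref + T_cur_ref[:3, 3]
    p_cur = (K @ (P_cur / P_cur[2]))[:2]
    r = []
    for du in range(-patch, patch + 1):
        for dv in range(-patch, patch + 1):
            u_r, v_r = int(p_ref[0]) + du, int(p_ref[1]) + dv
            u_c, v_c = int(round(p_cur[0])) + du, int(round(p_cur[1])) + dv
            r.append(float(I_cur[v_c, u_c]) - float(I_ref[v_r, u_r]))
    return np.array(r)

def reprojection_residual(p_obs, P_world, T_cur_world, K):
    """Indirect residual: pixel distance between an observed feature and the
    projection of its 3D point (what frame-to-keyframe refinement minimizes)."""
    P_cur = T_cur_world[:3, :3] @ P_world + T_cur_world[:3, 3]
    p_proj = (K @ (P_cur / P_cur[2]))[:2]
    return p_obs - p_proj
```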

Processing Time of SVO vs. ORB-SLAM, LSD-SLAM, DSO
[Figure: processing times in milliseconds.]
The SVO front end is over 10x faster than state-of-the-art systems and 4x more efficient (it runs on half a CPU core instead of two cores). This makes it appealing for real-time applications on embedded PCs (drones, smartphones).
Forster, Zhang, Gassner, Werlberger, Scaramuzza, SVO: Semi-Direct Visual Odometry for Monocular and Multi-Camera Systems, IEEE Transactions on Robotics (T-RO), 2017. PDF, code, videos.

SVO Pro (2021) - just released - does full SLAM!
Includes:
- Monocular and stereo systems, as well as omnidirectional camera models (fisheye and catadioptric)
- Visual-inertial sliding-window optimization backend (modified from OKVIS)
- Loop closure via DBoW2
- Global bundle adjustment or pose-graph optimization via iSAM2 in real time (at frame rate); see the sketch below
SVO Pro contains a full SLAM system running in real time.
Source code of SVO Pro: https://github.com/uzh-rpg/rpg_svo_pro_open
Forster, Zhang, Gassner, Werlberger, Scaramuzza, SVO: Semi-Direct Visual Odometry for Monocular and Multi-Camera Systems, IEEE Transactions on Robotics (T-RO), 2017. PDF, code, videos.
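For readers unfamiliar with iSAM2-style backends, here is a minimal sketch of incremental pose-graph optimization using the GTSAM Python bindings (which provide iSAM2). This is not SVO Pro's code; the keys, noise values, and two-pose graph are purely illustrative assumptions.

```python
import numpy as np
import gtsam

# Incremental pose-graph optimization with iSAM2 (illustrative values only).
isam = gtsam.ISAM2()
graph = gtsam.NonlinearFactorGraph()
initial = gtsam.Values()

# A prior on the first keyframe pose anchors the graph.
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 1e-3))
graph.add(gtsam.PriorFactorPose3(0, gtsam.Pose3(), prior_noise))
initial.insert(0, gtsam.Pose3())

# Odometry constraint between keyframes 0 and 1 (e.g., from a VIO front end).
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.full(6, 1e-2))
delta = gtsam.Pose3(gtsam.Rot3(), gtsam.Point3(1.0, 0.0, 0.0))  # 1 m forward
graph.add(gtsam.BetweenFactorPose3(0, 1, delta, odom_noise))
initial.insert(1, delta)

# A loop-closure constraint would be added the same way, with BetweenFactorPose3
# linking two non-consecutive keyframes recognized by place recognition (DBoW2).
isam.update(graph, initial)          # incremental update, suitable for frame rate
estimate = isam.calculateEstimate()
print(estimate.atPose3(1))
```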

- Autonomous quadrotor navigation in dynamic scenes (down-looking camera), running on an Odroid U3 board (ARM Cortex-A9) at 90 fps
- 20 m/s obstacle-free autonomous quadrotor flight at DARPA FLA (2015)
- Throw-and-go (2015), which inspired many products, like the DJI Tello drone
- Virtual reality with SVO running on an iPhone 6 (with the company Dacuda at CES 2017)
More here: http://rpg.ifi.uzh.ch/svo2.html

Startup: "Zurich-Eye" - today: Facebook-Oculus Zurich
- Vision-based localization and mapping systems for mobile robots
- Born in Sep. 2015, became Facebook Zurich in Sep. 2016; today 200 employees
- In 2018, Zurich-Eye launched Oculus Quest (2 million units shipped so far)
"From the lab to the living room": the story behind Facebook's Oculus Insight technology, from Zurich-Eye to Oculus Insight.

SVO and its derivatives are used today in many products:
- DJI drones
- Magic Leap AR headsets
- Oculus VR headsets
- Huawei phones
- Nikon cameras

Takeaway: partner with industry to understand the key problems
- Industry provides use cases
- They have very stringent requirements:
  - Low latency
  - Low energy (e.g., AR, VR, always-on devices): see the NAVION or PULP chips
  - Robustness to HDR, blur, dynamic environments, harsh environment conditions
  - Accuracy: e.g., construction monitoring requires maps with 5 mm absolute error

Today's Outline
- A brief history of visual SLAM
- SVO and real-world applications
- Active exposure control (up next)
- Event cameras

HDR scenes are challenging for SLAM
- Cameras have limited dynamic range
- Built-in auto-exposure is optimized for image quality, not for SLAM!
- Idea: actively adjust the exposure time (a simple controller is sketched below)
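A hedged sketch of the idea, not the controller from Zhang et al., ICRA'17: choose the exposure time that maximizes a gradient-based image metric, since strong, unsaturated gradients are what feature-based and direct VO actually track. The metric, the hill-climbing update, the `grab_frame` camera hook, and the bounds are illustrative assumptions.

```python
import numpy as np

def gradient_metric(img):
    """Sum of image gradient magnitudes, ignoring clipped pixels.
    (Illustrative metric; the published method uses a more refined,
    derivative-friendly gradient score.)"""
    img = img.astype(np.float32)
    gx = np.diff(img, axis=1)[:-1, :]
    gy = np.diff(img, axis=0)[:, :-1]
    mag = np.sqrt(gx ** 2 + gy ** 2)
    valid = (img[:-1, :-1] > 5) & (img[:-1, :-1] < 250)  # skip saturated pixels
    return float((mag * valid).sum())

def next_exposure(exposure_us, grab_frame, step=1.05, min_us=20, max_us=20000):
    """One iteration of a simple hill-climbing exposure controller: try slightly
    shorter/longer exposures and keep the one with the best gradient score.
    `grab_frame(exposure)` is an assumed camera hook returning a grayscale image."""
    candidates = [exposure_us / step, exposure_us, exposure_us * step]
    scores = [gradient_metric(grab_frame(e)) for e in candidates]
    best = candidates[int(np.argmax(scores))]
    return float(np.clip(best, min_us, max_us))
```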

Active Camera Exposure Control
[Video: standard built-in auto-exposure vs. our active exposure control.]
Zhang, Forster, Scaramuzza, Active Exposure Control for Robust Visual Odometry in HDR Environments, ICRA'17. PDF. Video.

Takeaway: make your algorithm scene-aware!
Cameras have many parameters that can be adaptively tuned or actively controlled to achieve the best performance:
- Scene-aware exposure-time control [Zhang, ICRA'17]
- Scene-aware motion-blur & rolling-shutter compensation [Meilland, ICCV'13] [Liu, ICCV'21]
- More generally, we need scene-aware, continuous self-calibration & parameter control

Today's Outline
- A brief history of visual SLAM
- SVO and real-world applications
- Active exposure control
- Event cameras (up next)

Open Challenges in Computer Vision
The past 60 years of research have been devoted to frame-based cameras, but they are not good enough!
- Latency & motion blur
- Dynamic range
Event cameras do not suffer from these problems!

What is an event camera?
- Novel sensor that measures only motion in the scene
- Key advantages:
  - Low latency (~1 μs)
  - No motion blur
  - Ultra-low power (mean: 1 mW vs 1 W)
  - High dynamic range (140 dB instead of 60 dB)
- Traditional vision algorithms cannot be directly used because the pixels are asynchronous (see the sketch below)
[Video: VGA event camera from Prophesee; event camera output.]
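Because the output format is so different from frames, a minimal sketch of what an event stream looks like in code may help. The field names are an assumption; the content (pixel coordinates, microsecond timestamp, polarity) matches the standard event model.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Event:
    x: int         # pixel column
    y: int         # pixel row
    t_us: int      # timestamp in microseconds (asynchronous, per pixel)
    polarity: int  # +1 = brightness increase (ON), -1 = decrease (OFF)

# A tiny, made-up slice of an event stream: note the irregular timestamps --
# events are emitted only where and when the log intensity changes.
stream: List[Event] = [
    Event(120, 64, 1_000_017, +1),
    Event(121, 64, 1_000_021, +1),
    Event(305, 210, 1_000_118, -1),
]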

Opportunities
- Low latency: AR/VR, automotive (<10 ms)
- Low energy: AR/VR, always-on devices (see SynSense)
- HDR & no motion blur

Who sells event cameras, and how much do they cost?
- Prophesee & SONY: ATIS sensor (events, IMU, absolute intensity at the event pixel); resolution: 1M pixels; cost: ~5,000 USD
- Inivation & Samsung: DAVIS sensor (frames, events, IMU); resolution: VGA (640x480 pixels); cost: ~5,000 USD
- CelePixel Technology & Omnivision: CeleX One (events, IMU, absolute intensity at the event pixel); resolution: 1M pixels; cost: ~1,000 USD
- Cost expected to sink to ~5 USD when a killer application is found (recall that the first ToF camera (~10,000 USD) costs ~50 USD today)

Generative Event Model
Consider the intensity at a single pixel. An event is triggered whenever the log-intensity change since the last event at that pixel reaches the contrast threshold C:

\log I(\mathbf{x}, t) - \log I(\mathbf{x}, t - \Delta t) = p\,C, \qquad p \in \{+1, -1\},

where p = +1 gives an ON event and p = -1 an OFF event. Notice that events are generated asynchronously (a per-pixel simulation is sketched below).
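A hedged sketch of this generative model for a single pixel: given samples of the log intensity over time, emit an ON/OFF event every time the signal moves by the contrast threshold away from the level recorded at the last event. The threshold value and the sinusoidal test signal are illustrative.

```python
import math

def events_from_log_intensity(times, log_I, C=0.2):
    """Simulate the per-pixel generative event model: emit (t, polarity)
    every time the log intensity departs from the last event level by >= C.
    `times` and `log_I` are equal-length samples of one pixel's signal."""
    events = []
    ref = log_I[0]  # log intensity at the last event (or at the start)
    for t, L in zip(times[1:], log_I[1:]):
        # Several events can fire if the change crosses multiple thresholds
        while L - ref >= C:
            ref += C
            events.append((t, +1))   # ON event
        while ref - L >= C:
            ref -= C
            events.append((t, -1))   # OFF event
    return events

# Example: a pixel watching a sinusoidally varying brightness
ts = [i * 1e-4 for i in range(1000)]                       # 0.1 s at 10 kHz
logI = [math.log(1.0 + 0.5 * math.sin(40 * t)) for t in ts]
print(len(events_from_log_intensity(ts, logI, C=0.2)), "events")
```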

Do events carry the same visual information as normal cameras?
From the event-generation model, we can reconstruct images up to an unknown initial intensity value (made explicit in the formula below).
[Videos: events; reconstructions by Munda, IJCV'18 and Scheerlinck, ACCV'18.]
Results are far from perfect, mainly because the contrast threshold is not constant (it depends on scene content).
Can we learn video reconstruction from events end to end?
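The "up to an unknown value" statement follows from summing the generative model per pixel; one possible way to write it (the notation is assumed, not taken from the slide):

```latex
% Integrating the generative model at pixel x up to time t:
% every event e_k = (x, t_k, p_k) contributes one contrast step p_k C.
\hat{L}(\mathbf{x}, t) \;=\;
\underbrace{L(\mathbf{x}, 0)}_{\text{unknown initial log intensity}}
\;+\; C \sum_{k \,:\, \mathbf{x}_k = \mathbf{x},\; t_k \le t} p_k ,
\qquad L = \log I .
```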

Can we learn to reconstruct video from events?
[Video: events and the video reconstructed from them.]
The video reconstruction is now very accurate because the network learns an implicit noise model.
Rebecq et al., "High Speed and High Dynamic Range Video with an Event Camera", T-PAMI'19. PDF. Video. Code.

Learned from Simulation Only - One Shot
- Recurrent neural network based on UNet (a sketch of the typical event-tensor input follows below)
- Trained in simulation only, deployed on a real event camera without fine-tuning
- We randomize the contrast sensitivity to reduce the sim-to-real gap
- Generalizes to real and different event cameras without fine-tuning
Source code & Datasets: https://github.com/uzh-rpg/rpg_e2vid
Rebecq et al., "High Speed and High Dynamic Range Video with an Event Camera", T-PAMI'19. PDF. Video. Code.
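Networks like this one do not consume raw event lists directly; a common input representation for E2VID-style pipelines is a spatio-temporal voxel grid in which each event spreads its polarity over the two nearest temporal bins. Below is a hedged sketch of that conversion, not the reference code; bin count and array layout are assumptions.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Convert events [(x, y, t, polarity), ...] into a (num_bins, H, W)
    voxel grid with bilinear interpolation along the time axis."""
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    ev = np.asarray(events, dtype=np.float64)
    if ev.size == 0:
        return grid
    x, y, t, p = ev[:, 0].astype(int), ev[:, 1].astype(int), ev[:, 2], ev[:, 3]
    # Normalize timestamps to [0, num_bins - 1]
    t = (t - t.min()) / max(t.max() - t.min(), 1e-9) * (num_bins - 1)
    left = np.floor(t).astype(int)
    right = np.clip(left + 1, 0, num_bins - 1)
    w_right = t - left            # bilinear weights along the time axis
    w_left = 1.0 - w_right
    np.add.at(grid, (left, y, x), p * w_left)
    np.add.at(grid, (right, y, x), p * w_right)
    return grid

# Example with three synthetic events on a 4-bin, 180x240 grid
evts = [(10, 20, 0.000, +1), (11, 20, 0.004, -1), (50, 90, 0.009, +1)]
vox = events_to_voxel_grid(evts, num_bins=4, height=180, width=240)
print(vox.shape, vox.sum())
```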

The reconstructed video inherits all the advantages of event cameras, e.g., high temporal resolution
[Video: bullet at 1,300 km/h - Huawei P20 phone camera vs. our reconstruction from events at over 5,000 fps.]
Source code & Datasets: https://github.com/uzh-rpg/rpg_e2vid
Rebecq et al., "High Speed and High Dynamic Range Video with an Event Camera", T-PAMI'19. PDF. Video. Code.

The reconstructed video inherits all the advantages of event cameras, e.g., high dynamic range
[Video: raw events, our reconstruction from events, and the Huawei P20 phone camera.]
Source code & Datasets: https://github.com/uzh-rpg/rpg_e2vid
Rebecq et al., "High Speed and High Dynamic Range Video with an Event Camera", T-PAMI'19. PDF. Video. Code.

What happens if we feed the reconstructed video to a state-of-the-art SLAM algorithm?
The SLAM system inherits all the advantages of event cameras: no motion blur, HDR, low latency!
Rebecq et al., "High Speed and High Dynamic Range Video with an Event Camera", T-PAMI'19. PDF. Video. Code.

The Key Challenge
- The fact that we can reconstruct high-quality video means that event cameras carry the same visual information as standard cameras
- So it must be possible to perform all the vision tasks of standard cameras
- But we want to build efficient and low-energy algorithms that compute the output without passing through an intermediate image reconstruction
[Diagram: events -> image reconstruction -> CV algorithm -> output, with the reconstruction step to be bypassed.]

Application 1: Low-Latency & Low-Energy Tracking
[1] Gallego et al., Event-based 6-DOF Camera Tracking from Photometric Depth Maps, T-PAMI'18. PDF. Video.
[2] Mueggler et al., Continuous-Time Visual-Inertial Odometry for Event Cameras, T-RO'18. PDF.
[3] Rosinol et al., Ultimate SLAM?, RA-L'18, Best Paper Award finalist. PDF. Video. IEEE Spectrum.
[4] Gehrig et al., EKLT: Asynchronous, Photometric Feature Tracking using Events and Frames, IJCV 2019. PDF. YouTube. Evaluation Code. Tracking Code.

Application 2: "Ultimate SLAM"
Goal: combine events, images, and IMU for robustness to HDR and high-speed scenarios.
Front end: feature tracking from events and frames; back end: optimization-based VIO.
Rosinol-Vidal, Rebecq, Horstschaefer, Scaramuzza, Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High-Speed Scenarios, IEEE Robotics and Automation Letters (RA-L), 2018. PDF. Video. Best Paper Award Honorable Mention.

Application 2: "Ultimate SLAM"
85% accuracy gain over standard VIO in HDR and high-speed scenarios.
[Video: standard camera vs. event camera.]
Rosinol-Vidal, Rebecq, Horstschaefer, Scaramuzza, Ultimate SLAM? Combining Events, Images, and IMU for Robust Visual SLAM in HDR and High-Speed Scenarios, IEEE Robotics and Automation Letters (RA-L), 2018. PDF. Video. Best Paper Award Honorable Mention.

Application of Ultimate SLAM: Autonomous Flight despite Rotor Failure
- Quadrotors subject to a full rotor failure require accurate position estimates to avoid crashing
- State-of-the-art systems used external position tracking (e.g., GPS, Vicon, UWB)
- We achieve this with only onboard cameras; with event cameras, we can make it work in very low light!
Sun, Cioffi, de Visser, Scaramuzza, Autonomous Quadrotor Flight despite Rotor Failure with Onboard Vision Sensors: Frames vs. Events, IEEE RA-L, 2021. PDF. Video. Code. 1st place winner of the NASA TechBrief Award: Create the Future Contest.

Application 3: Slow-Motion Video
- We can combine an event camera with an HD RGB camera
- We use events to upsample low-framerate video by over 50 times, with only 1/40th of the memory footprint!
Code & Datasets: http://rpg.ifi.uzh.ch/timelens
Tulyakov et al., TimeLens: Event-based Video Frame Interpolation, CVPR'21. PDF. Video. Code.

Application 4: Event-Guided Depth Sensing
- Problem: standard depth sensors (ToF, LiDAR, structured light) sample depth uniformly and at a fixed scan rate, thus oversampling redundant static information, which leads to large power consumption and high latency
- Idea: use an event camera to guide the depth measurement process: scan with higher spatial density the areas generating events, and with lower density the remaining areas (a toy sampling-budget sketch follows below)
- Finding: since moving edges correspond to less than 10% of the scene on average, event-guided depth sensing could lead to almost 90% less power consumption by the illumination source
Muglikar et al., Event-Guided Depth Sensing, 3DV'21. PDF.
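A hedged sketch of the guiding idea, not the 3DV'21 method: accumulate recent events into a per-cell activity map and allocate most of the depth-sampling budget to active cells, keeping a small uniform budget for the static remainder. Grid size, time window, and budget split are illustrative assumptions.

```python
import numpy as np

def sampling_density(events, height, width, cell=16,
                     active_fraction=0.9, budget=10000):
    """Split a depth-sampling budget between active and static regions.
    `events` is an iterable of (x, y, t, polarity) from the last time window.
    Returns an (H/cell, W/cell) map with the number of depth samples per cell."""
    gh, gw = height // cell, width // cell
    activity = np.zeros((gh, gw), dtype=np.float64)
    for x, y, _, _ in events:
        activity[min(y // cell, gh - 1), min(x // cell, gw - 1)] += 1.0

    samples = np.zeros_like(activity)
    # Uniform baseline so static structure is still (sparsely) measured
    samples += (1.0 - active_fraction) * budget / activity.size
    if activity.sum() > 0:
        # Remaining budget goes to cells in proportion to their event count
        samples += active_fraction * budget * activity / activity.sum()
    return samples

# Toy example: events concentrated in one corner of a VGA sensor
rng = np.random.default_rng(0)
evts = [(int(x), int(y), 0, 1) for x, y in rng.uniform(0, 80, size=(500, 2))]
density = sampling_density(evts, height=480, width=640)
print(density.shape, int(density.sum()))
```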

Conclusion
- Visual-inertial SLAM theory is well established
- The biggest challenges today are reliability and robustness to: HDR, low light, adverse environment conditions, motion blur, low texture, dynamic environments
- Active control of camera parameters, like exposure time, can greatly help
- Machine learning exploits context & provides robustness; the best way to use it is to combine it with geometric approaches
- Event cameras are complementary to standard cameras:
  - Robust to high-speed motion and HDR scenes
  - Allow low latency and low energy, which is key for AR/VR and always-on devices
- Current SLAM datasets are saturated: new datasets & challenges are needed (Common Task Framework)!

Thanks!
Code, datasets, videos, publications, and slides: http://rpg.ifi.uzh.ch/
I am hiring PhD students and Postdocs in AI.
@ailabRPG  @davsca1  @davidescaramuzza
