Open Networking Summit 2017
April 3-6, 2017 - Santa Clara Convention Center

Tuesday, April 4 • 10:30am - 11:20am
How Does Uber Check the Health of its Data Center (DC) and Cloud Infrastructures Every 10 Sec? - Vasileios Lakafosis, Uber

One of the major challenges in achieving very high (>99.99%) operational reliability in any major network infrastructure (e.g., data center, enterprise, or campus) is designing and deploying an always-on active system that performs end-to-end functional testing of all network-connected infrastructure components. Such a system must monitor the infrastructure and its dependent external services with high accuracy and granularity (down to the packet level) while consuming the least possible computational and network resources.

When it comes to packet loss detection, metrics reported by the original manufacturers cannot be relied upon: their tools may be buggy or, in most cases, provide no APIs for extracting measurements. We therefore needed to create our own tool; this is the gap that Arachne fills.

In this talk, we present Arachne, a system for detecting packet loss and underperforming paths. It provides fast and easy active end-to-end functional testing of all the components in Data Center (DC) and Cloud infrastructures. Arachne detects intra-DC, inter-DC, DC-to-Cloud, and DC-to-External-Services issues while generating minimal traffic.

Vasileios Lakafosis

Senior Software Engineer, Uber

Tuesday April 4, 2017 10:30am - 11:20am PDT
Grand Ballroom F
  Enterprise - DevOps
  • Experience Level Any