NANOG purchased a wifi solution based on Xirrus for NANOG to own, selected via an RFP process.
1,700 unique devices connected to wifi this session
Xirrus has been on site the whole time trying to find ways to utilize spectrum. Had bad cable drops, bad hardware, bad patch cables.
(Wifi has sucked repeatedly. Lots of speed shifts, lots of stalled sessions. Had to toggle MacBook wifi off/on several times in an attempt to get a working connection.)
NANOG is not blaming Xirrus and expects great things in the future. Expect no "issues" for next meeting.
Next meeting ought to be amusing...
Wednesday, June 3, 2015
Evolution of Ethernet Speeds - What’s New and What’s Next
Expect 802.11ax to go four times faster than 802.11ac.
See more at: https://www.nanog.org/meetings/abstract?id=2576
In this presentation we'll talk about the latest Ethernet developments that are bringing a variety of new technology to the market for different applications, with speeds ranging from 2.5 GE to 400 GE. We'll take a look at the new 2.5 GE, 5 GE and 25 GE speeds, 2nd generation 40 GE and 100 GE, 400 GE, and what's possible in the future.
"If we go for the more conservative 4x estimate, and assume a massive 160MHz channel, the maximum speed of a single 802.11ax stream will be around 3.5Gbps (compared with 866Mbps for a single 802.11ac stream). Multiply that out to a 4×4 MIMO network and you get a total capacity of 14Gbps." http://www.extremetech.com/computing/184685-what-is-802-11ax-wifi-and-do-you-really-need-a-10gbps-connection-to-your-laptop
NBASE-T - "everyone except Broadcom"
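The quoted figures are simple multiplication; a quick sanity check of the article's arithmetic (numbers taken from the quote above):

```python
# Check the article's 802.11ax math: 4x a single 802.11ac stream,
# then a 4x4 MIMO multiplier.
AC_STREAM_MBPS = 866          # one 802.11ac stream
AX_SPEEDUP = 4                # the "conservative 4x estimate"
MIMO_STREAMS = 4              # 4x4 MIMO

ax_stream_mbps = AC_STREAM_MBPS * AX_SPEEDUP      # 3464 -> "around 3.5Gbps"
total_mbps = ax_stream_mbps * MIMO_STREAMS        # 13856 -> the quoted "14Gbps"
print(ax_stream_mbps, total_mbps)                 # 3464 13856
```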
For 25Gbps there is an SFP28 - same size as 10Gig.
40Gbps - done and in good shape. Popular in DCs with breakout cables. There is now a 40km SM interface.
100Gbps - in 2nd generation. 1M 100GE ports projected in 2016. In "early majority" phase of market adoption. (OSI 100Gbps transceivers are down to $20K currently.) QSFP28 down to 3.5W for 100Gbps transceiver. Currently four different vendor MSAs to do short reach (2km or 500m) for 100Gbps. Market will sort it out. 100Gbps signaling is still just on/off signaling.
400Gbps uses complex modulation - can't blink the light on/off fast enough with current electronics. 802.3bs task force to develop interfaces for 400Gbps. 400GBASE-SR16 - 16 x 25Gbps over parallel MMF. strong desire to support 10km.
Expect the 400Gbps standard by 2017, first interfaces some time after that. Initial modules: CDFP for short reach; CFP2 (old 100Gbps module form factor) which has an 8x50Gbps electrical interface.
Some time in 2020+ expect serial signaling at 400 Gbps, which will make terabit possible by combining multiple 400 Gbps serial flows.
Could see some new standards around 50 Gbps signaling as a result of work around 25Gbps.
"We are still a ways away from 400 Gbps serial signaling, but will get there eventually."
"At higher speeds and longer distances (beyond the standard) it gets into optical company secret sauce and you will have to deal with the vendors for 40 and 80km."
Q: what about MTU negotiation in auto-negotiation? A: no interest in IEEE to define maximum frame size due to installed base. Customers could insist on it if they want. (Note that standard Ethernet max frame size is now 2k, which holds 1500B of data and 482B of vlan ID/tags/labels/whatever)
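The 2k envelope arithmetic in that note works out, assuming the standard 14-byte Ethernet header plus 4-byte FCS:

```python
# 802.3 envelope frame budget: 2000 bytes total, a standard 1500-byte payload,
# and the remainder available for VLAN IDs / tags / labels.
ENVELOPE_MAX = 2000
PAYLOAD = 1500
HEADER_PLUS_FCS = 14 + 4      # Ethernet header + frame check sequence
tag_room = ENVELOPE_MAX - PAYLOAD - HEADER_PLUS_FCS
print(tag_room)               # 482
```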
New Cybersecurity Obligations and CPNI Rules Represent Regulatory Sea-Change for Network Operators
See more at: https://www.nanog.org/meetings/abstract?id=2590
The early months of 2015 have seen an unprecedented level of action in the realm of U.S. cybersecurity policy. The Obama Administration, in response to a growing number of cybersecurity compromises and data breaches, has announced an aggressive cybersecurity and data security agenda. The activity appears motivated at least in part by the spike in the number of U.S. data breaches in 2014.
Major push by the Obama admin since it got into the White House, with lots of activity in 2015.
The Cybersecurity agenda of the Obama Administration will have direct and indirect consequences for network operators, potentially adding to their already substantial regulatory burdens. On top of these new obligations, the FCC adopted new open Internet rules that for the first time regulate how Internet service providers can utilize Customer Proprietary Network Information (CPNI). That's a big deal to network operators who supplement revenue from user fees in a variety of ways by trafficking in user data. This presentation will educate the audience on the new rules and outline compliance strategies.
Lots of concern about sharing information about breaches due to liability, publicity, etc.
Obama pushing BCP, but so far voluntary.
No privacy law in US for citizens, instead this is done industry by industry, controlled by FTC to ensure that companies are meeting their published privacy statements. Kludgy and incoherent.
Need baseline protections for consumers, also being pushed by Obama.
Q: Why do you believe that FCC open access will not be struck down? A: Decision by court reviewing FCC rules was extremely clear and listed elements so well that difficult to challenge. (previous speaker was also of that opinion). Q. Agree that re-classification into common carriage is right thing to do, but is it defensible this time because it was done properly this time? A: The DC Circuit addresses most agency appeals, next step is Supreme Court. DC Circuit listed everything that FCC needed to do to meet legal requirements for common carriage rules and FCC met all of those requirements this time. So feeling is that it will withstand Supreme Court and even unlikely for challenge to get to that level.
Overview of DDoS types
Review of DDoS types being seen today.
Perception is that DDoS is some magical event that you cannot anticipate or deal with. But that is not true, and there are approaches you should take for mitigation. Present mitigation approaches in a cost/benefit matrix to mgmt so that they can decide what SLA they need/want and how much they are willing to spend.
This talk covers the principles and particular implementations of DDoS. It goes into detail as to what are the bottlenecks that are generally exploited/overloaded, the attack types and the solutions to those. - See more at: https://www.nanog.org/meetings/abstract?id=2584
Architecture for fine-grain, high-resolution Telemetry for network elements
See more at: https://www.nanog.org/meetings/abstract?id=2574
Yet another discussion about how broken network monitoring and management is. Maybe it's time for NANOG to organize a user revolt against vendors. Come up with a list of what customers need and tell vendors that they need to meet the new requirements. How about config file management with commit testing and rollback, Cisco?
Networks are evolving quickly into highly automated, self-adapting, intelligent integrated systems. However, even the most intelligent system can only make decisions as good as the input information it is provided with.
In this session we address an architecture that enables high frequency export of telemetry data from network elements. Traditional protocols like SNMP retrieve data from network elements using a “pull” model, which suffers from several well documented shortcomings, the most important being a centralized architecture which causes strain on the central processor of the network element, taking away cycles from the main functions of the router.
This architecture addresses the problem by creating a distributed export mechanism where telemetry is “pushed” out directly from the source, rather than relying on the central component. This is intended to enable innovative applications, such as dynamic provisioning of devices based on utilization levels, security and/or quality prediction based on anomaly heuristic analysis, etc. The architecture defines the following:
1. The Open telemetry model for a network element.
- Configuration and Provisioning
- Capability discovery
2. Implementation of telemetry probes in various internal sub systems of a network element.
3. Open-source based Export mechanisms via which telemetry data is generated
SDN in the Management Plane: OpenConfig and Streaming Telemetry
See more at: https://www.nanog.org/meetings/abstract?id=2573
The networking industry has made good progress in the last few years on developing programmable interfaces and protocols for the control plane to enable a more dynamic and efficient infrastructure. Despite this progress, some parts of networking risk being left behind, most notably network management and configuration. The state-of-the-art in network management remains relegated to proprietary device interfaces (e.g., CLIs), imperative, incremental configuration, and lack of meaningful abstractions.
We propose a framework for network configuration guided by software-defined networking principles, with a focus on developing common models of network devices, and common languages to describe network structure and policies. We also propose a publish/subscribe framework for next generation network telemetry, focused on streaming structured data from network elements themselves.
All the usual problems: CLIs are proprietary, scaling sucks, screen-scrape sucks, etc.
However, these are SDN shops, so they can change stuff in software and stop waiting for vendors to never fix things. New world order. But wait! Didn't they hear that Cisco is a software oriented company?!
Propose: "model-driven network management"
1. topology
2. configuration
3. telemetry
Telemetry:
SNMP default choice. Old protocol, choices were made to conserve limited resources of the time.
Asking for requirements on a new telemetry protocol to replace SNMP.
gRPC, Thrift, or protocol buffers over UDP (sounds like SNMP)
streaming telemetry is the goal - ask your vendors
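The pull-vs-push distinction can be sketched in a few lines. This is a hypothetical minimal exporter — JSON over loopback UDP standing in for the gRPC/Thrift/protobuf transports mentioned above (none of whose actual APIs are shown here): the device emits counters on its own schedule instead of waiting to be polled.

```python
import json
import socket
import time

def push_counters(sock, collector, counters):
    """Device-side push: emit a timestamped counter sample, unsolicited."""
    msg = {"ts": time.time(), "counters": counters}
    sock.sendto(json.dumps(msg).encode(), collector)

# Loopback demo: a collector socket receives one pushed sample.
collector = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
collector.bind(("127.0.0.1", 0))          # OS-assigned port
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
push_counters(sender, collector.getsockname(),
              {"ifInOctets.eth0": 123456, "ifOutOctets.eth0": 654321})
data, _ = collector.recvfrom(65535)
sample = json.loads(data)
print(sorted(sample["counters"]))
```

The point of the push model is that no central poller burns the device's control-plane CPU walking a MIB; the telemetry subsystem streams from the source.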
Config:
OpenConfig effort - informal industry collaboration (operators, not vendors). Motivated by lack of abstractions and programmability, all the usual litany of complaints people have been repeating for years and vendors have not been addressing.
Focus on vendor-neutral configuration and operational state models - adopted the YANG data modeling language (RFC 6020)
Weekly meetings, github repository
No legal structure to this group, nothing formal, trying to avoid that overhead (and political layer)
Their approach is operations specific, don't want to care about standards politics
Looks like the IETF and other groups are now being ignored when people need to get work done. Yet another new world order.
"It's time for the management plane to join the age of SDN"
They do not intend to play the game according to vendors' rules. "The fact that there is a different set of commands for BGP across vendors and devices is ridiculous and there is no reason for it that we can see."
Q: Why don't vendors adopt approaches used by server admins, puppet, chef, etc. Why not use the same technology? A. tools for server management may not apply across the board to more complex pieces of network gear with complex configs. However, the models are key, and models can use JSON encoding and with puppet/chef.
Q. Why push this in this way? There are a number of repositories for YANG stuff. Why is what you are doing different? A. We are publishing models into some of the same repositories. We are also working with the IETF; however, this is a consumer view and an operators' view, so our perspective is different.
Q. are you positing to say IETF is not effective forum for operators to work with? A. we are trying hard to work with ietf and it hasn't been easy.
Q. how can we avoid doing MIBs all over again in new formats? A. operator-based view, pushing vendors to support a base model, trying to keep lean. vendors should be able to add their own stuff, but should be programmable unlike past. Q. when dealing with vendors if you give them an inch and they will take a mile. A. we are trying to make base model as complete as possible, so challenge to vendor is what do you need that isn't in base?
A (Igor): The IETF moves very slowly. The IETF is structured into very different hierarchies and creates one structural guidance for every hierarchy ("make everything look the same"), so we are avoiding that, and we will be feeding this back to the IETF with informational drafts. Basic answer: the IETF is too slow and its org is not helpful.
Source Routing 2.0. Why Now, Why Again?
Traditional source routing using IP header options was never widely deployed due to security concerns. Recent buzz around Segment Routing (a.k.a. SPRING) has re-invigorated interest in source routing technologies and their potential benefits. For many operators however, moving to SPRING represents a significant change in their operating practices, so in a more incremental approach they are implementing SPRING-inspired designs using current technologies with minor augmentations.
In this talk we will review SPRING/SR, but will mainly focus on using existing protocols for achieving similar benefits. We will discuss: - clever usage of static LSPs to achieve predictable label values in a data-center network - minor enhancements to BGP-LU for more resilient EPE (Egress Peer Engineering) - interoperability considerations between SPRING and non-SPRING domains
See more at: https://www.nanog.org/meetings/abstract?id=2585
Traditional approach: putting source-routed headers in packets. All deprecated in both v4 and v6. RFC 5095 pretty much bans routing headers in packets.
If put into tunnel and source route the tunnel then people find that acceptable: MPLS TE with EROs.
New interest in "segment routing" to tunnel packets from src to dst by describing route in the header as a sequence of "segments." Arises out of SDN / controller world view.
Appears to assume a closed universe whose endpoints are owned/operated by same entity. Example: VXLAN tunneling in DC environment.
Google created a mesh of RSVP LSPs that cover all paths, creating a lot of extra state in their network. Then they were able to analyze perf across all LSPs to monitor state of large complex mesh inter and intra Google DCs. This approach adopted by other big providers using static LSPs to avoid RSVP state overhead. Then use SPRING/source routing to push packets across specific paths for monitoring and analysis.
Could also use across MPLS paths in DC. He seems to think that using MPLS in the DC is controversial. Presumably because VMware thinks they own the space with VXLAN?
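The "predictable label values" idea maps directly onto how Segment Routing derives labels: every node advertises an index (node SID), and each router computes the label as a shared base (the SRGB) plus that index. A sketch with illustrative values — 16000 is a common default SRGB base, not a number from the talk:

```python
# Segment Routing's predictable labels: SRGB base + node SID.
# Any router in the domain can compute the label for any node locally,
# with no per-LSP signaling state (the property the static-LSP designs
# described above are trying to approximate).
SRGB_BASE = 16000        # common default Segment Routing Global Block start

def node_label(node_sid):
    """Label used domain-wide to reach the node with this SID."""
    return SRGB_BASE + node_sid

print(node_label(41))    # 16041: reach node 41 via this label anywhere
```

Because the mapping is arithmetic, a controller (or a monitoring system like the one Google built) can construct an explicit path simply by stacking the labels for each hop it wants to visit.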
Lightning talk Tuesday: DNS KSK must be replaced
Geoff Huston gave a lightning talk on Tues eve about the fact that the Key Signing Key for DNSSEC has reached end of lifetime (5 years) and needs to be replaced.
https://blog.apnic.net/2015/05/22/the-dns-root-zone-key-signing-key-is-changing/
Issues:
The problem is that a roll of the Root Zone KSK has never been done before. While there is a standard specification of how a resolver can update its local copy of the KSK (documented in RFC 5011) it’s not clear how many DNSSEC-validating resolvers support this standard. Those resolvers that don’t support RFC5011 will be left stranded with the old KSK value and will no longer operate as intended until an operator reloads the resolver with the new KSK value.
The key roll also involves a period of slightly larger responses from the Root Zone, of up to 1,425 octets. This should not present a major issue, but it is above the 1,232 octet maximum unfragmented DNS payload in IPv6, and there are some concerns relating to UDP fragmentation in IPv6 and the fallback to TCP that have yet to be quantified.
Given these unknowns, this roll of the KSK is going to need to be handled carefully for DNSSEC to continue to operate properly for the pool of 750 million users who already rely on it.
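The 1,232-octet figure quoted above falls out of the IPv6 minimum MTU minus the fixed IPv6 and UDP headers:

```python
# Why 1,232 octets is the maximum unfragmented DNS payload over IPv6.
IPV6_MIN_MTU = 1280      # smallest MTU every IPv6 link must support
IPV6_HEADER = 40         # fixed IPv6 header
UDP_HEADER = 8           # UDP header
max_unfragmented_payload = IPV6_MIN_MTU - IPV6_HEADER - UDP_HEADER
print(max_unfragmented_payload)               # 1232
print(1425 > max_unfragmented_payload)        # True: key-roll responses exceed it
```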
Motivation, Analysis, and Architecture for IPv4aaS
See more at: https://www.nanog.org/meetings/abstract?id=2572
In this talk, we share our work in creating an “IPv4 as a service” network overlay. We begin by suggesting there is value in building network infrastructure which is “lean” and "IPv6 focused". There is likely value in focusing on a lean FIB infrastructure and IPv6 focused functionality in our next-generation routing platforms. We performed traffic analysis on how IPv4 is being used in our network today. We found that 90% of the traffic in our fully routed backbone is carried by 0.005% of the routes. 99% of the traffic is carried by 4.5% of the prefixes. This data suggests we can incrementally deploy an IPv4aaS solution. Our goal is to build an IPv4aaS using cloud infrastructure based on open source and home grown software. We then present an IPv4aaS built on top of LISP. While we use LISP encapsulation, we have decided not to use the LISP DDT “routing” mechanism. Our IPv4aaS overlay routing architecture must associate IPv4 prefixes with IPv6 next-hops. This isn’t supported with “classic” BGP today. To solve this, we augmented JSON BGP IPv4 prefix updates with additional JSON information, namely an IPv6 next-hop. Effectively, we created a BGP IPv4 update with an IPv6 next-hop value. These messages are HTTP PUT to a route reflector / controller device, which processes the updates and applies associated business rules. These messages are then HTTP PUT to the overlay ingress (iTR), which announces a “default” to the underlay network. We then suggest the concepts in this presentation are an extension to the concepts we've presented at the previous two June NANOG meetings.
Discussion of an IPv4 overlay on top of IPv6 over time. Wants to start thinking about what that should look like before it happens. Assumption is that the world moves to v6, optimizes for v6, and then keeps v4 running somehow.
Concepts: Want to have a "lean core" to reduce costs for hardware, mgmt, etc. That means v6-only core as the endgame, so need to think about getting there over time.
575K prefixes in current FIB. Based on studies, 72 percent of prefixes being advertised are not being used. Only about 160K prefixes have traffic. 415K prefixes have no measurable traffic!
Comcast analysis: to carry 90 percent of v4 load need 3,156 prefixes. About 2,300 Comcast, 900 Internet.
Thinking about building overlay/underlay, looks like don't need that many prefixes for the v4 overlay.
99 percent point is 25,893 prefixes (from comcast view of world) - about 6K are comcast prefixes.
So 99 percent of Internet is carried on about 20K prefixes. Very interesting....
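The heavy-hitter analysis behind these numbers can be sketched as a cumulative sort over per-prefix traffic counters. The sample data here is invented for illustration, not Comcast's:

```python
# How many prefixes (sorted heaviest-first) are needed to cover a target
# fraction of total traffic? This is the core of the 90%/99% analysis above.
def prefixes_for_coverage(traffic_by_prefix, target=0.90):
    total = sum(traffic_by_prefix.values())
    covered, count = 0, 0
    for _, octets in sorted(traffic_by_prefix.items(),
                            key=lambda kv: kv[1], reverse=True):
        covered += octets
        count += 1
        if covered >= target * total:
            return count

sample = {"10.0.0.0/8": 8000, "192.0.2.0/24": 1500,
          "198.51.100.0/24": 300, "203.0.113.0/24": 200}
print(prefixes_for_coverage(sample))   # 2 prefixes carry 90% of this sample
```

Run against a real FIB's traffic counters, the same function yields the "3,156 prefixes for 90%" style of result; everything past the cutoff is a candidate for the punted overlay path.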
Use LISP to virtualize v4 delivery in the cloud. IPv4aaS as overlay in VMs on cloud infrastructure. LISP ETRs and ITRs close to customer ingress/egress. Overlay would be primarily outbound service.
Peering provider can identify which prefixes with significant load and forward via main hardware path. The rest are punted to LISP overlay. Heavy hitter prefixes routed on underlay. The rest carried over overlay.
OpenStack isn't quite ready; still too data-center-centric and needs more work to get isolated from the DC problem space and become more general for use in this model of overlay/underlay.
Control plane evolution to JSON and HTTP - (NANOG 2013 New Orleans - Applying Web principles to the network)
Open router platforms: (NANOG 2014 Bellevue - open router platforms)
Randy Bush: ask vendors for reduction of prefixes that is possible by de-duplicating information
QUIC: Next generation multiplexed transport over UDP
See more at: https://www.nanog.org/meetings/abstract?id=2586
QUIC is a new transport protocol that runs on top of UDP. It is currently deployed to a significant fraction of Chrome users of Google services. It combines TCP-style congestion control, TLS-grade security, and HTTP/2 multiplexing. Our measurements on real-world traffic show that QUIC significantly decreases page load times and time between pauses in video playbacks. Google will continue scaling up the use of QUIC on the public internet over the next few months, with the goal of sending all Google traffic over QUIC. This talk will discuss how QUIC works, why it's an improvement over TCP, the problems we're anticipating as we deploy at scale, and what we're doing to prevent them.
New Google protocol to replace TCP for Web access. Incorporates TCP Cubic; adds signaling improvements, such as retransmissions getting new sequence numbers. Improved congestion control.
Lots of things people want to add to TCP, but it's hard to do. Selective ack, SACK, took 20 years to get to 90 percent deployment in TCP.
QUIC uses port 443. QUIC combines handshake for connection with security and you can send request with first transaction in the first round trip. Lots faster than TCP. QUIC very comparable to TLS, always encrypted. zero RTT handshake. Plans to adopt TLS 1.3 when complete. Signed proof of address.
peering.google.com has QUIC info: https://peering.google.com/about/quicfaq.html
Zero RTT is the biggest single feature - over 50 percent better perf. Better loss recovery as well. 10x fewer timeout based retransmissions vs TCP.
2014: chrome stable, rollout started Jan 2015.
On client side, if UDP is blocked then fall back to HTTP/TCP.
If the path MTU is too small (QUIC uses 1350B packets) then it falls back to TCP
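The client-side fallback described in the last two notes can be modeled as a simple decision function (hypothetical sketch, not Chrome's actual code):

```python
# Model of QUIC's fallback logic: QUIC needs working UDP and a path MTU
# of at least its 1350-byte packets; otherwise the client uses HTTP over TCP.
QUIC_PACKET_SIZE = 1350

def choose_transport(udp_blocked, path_mtu):
    if udp_blocked or path_mtu < QUIC_PACKET_SIZE:
        return "tcp"
    return "quic"

print(choose_transport(udp_blocked=False, path_mtu=1500))  # quic
print(choose_transport(udp_blocked=True,  path_mtu=1500))  # tcp (firewall)
print(choose_transport(udp_blocked=False, path_mtu=1280))  # tcp (small MTU)
```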
There is a Chrome flag to disable
Google can disable QUIC on a per-ASN basis if needed. Don't want to do it, but sometimes have to because of degraded quality-of-experience reports in YouTube (most common indicator).
Need to sign up for access to per-ASN QUIC info: https://peering.google.com/portalaccess/
There is a wireshark plugin for QUIC.
Chrome will display QUIC stats with: chrome://net-internals/#quic
Future: Forward error correction. - Experimenting with it, but hard.
Connection mobility - starting to work on that and multipath
Looking at improvements in congestion control. Kept TCP congestion control the same in QUIC for the moment.
Working to support in Apache Traffic Server
Google Chrome and some Google Android Apps are current main users of QUIC
QUIC was presented to IETF in 2013
Mar 2015: QUIC crypto was presented and has informed TLS 1.3 development and will subsume QUIC crypto as a profile in TLS 1.3
Working to move the QUIC zero RTT handshake into TLS 1.3 so TLS 1.3 can be adopted into QUIC
QUIC is open source - all in the Chromium source. No separate library, so it's all in Chrome at this point.
QUIC designed to improve user experience, lower latency, faster response.
Q: Constant encryption is a CPU cost; why choose to do so? A: We have reduced the CPU load, and we (Google) choose to encrypt everything.
Note: QUIC will fail at sites with firewalls blocking UDP. So the firewalled world is exacting yet another cost in latency and quality of experience (not that they care)
Rethinking Centralization
Noticed over the years that we keep adding layers to try to manage complexity. But we end up doing roughly the same things over and over again at each layer.
Why not lose all the layers and drop the whole mess into the control plane? Just layer the control plane. (VXLAN?)
Theory is that the control plane already has the best possible scalability protocols for reachability etc. - BGP, etc.
Where does it make sense to have an API vs a protocol?
Can polling ever outrun local reaction to an event? No. Local reaction is always faster.
Should policy be expressed in terms of forwarding?
Is a unified view of the network actually possible? We may be getting to the point of using heuristics vs algorithms.
What should we do about failure domains?
Are we ready to throw away 20+ years of experience to "try something new" (totally centralized -- his answer is "no")
Basically he is in favor of decentralization and not SDN.
Whither forward? Support YANG/NETCONF and I2RS
Think about existing protocols for southbound: BGP, PCEP
Has lived through "centralization vs decentralization" several times and thinks it is the wrong debate.
Dave Meyer: Is complexity always a bad thing. But if you want robustness, you need complexity. It is gratuitous complexity that you don't want. Try to understand what complexity is adding robustness and what isn't. Discussion about layering is on-target. Layering is one way to build the good type of complexity because it helps you to isolate and understand.
Control of complexity comes down to policy dispersion - who controls and where?
Centralization seems to be the current "silver bullet" for network complexity — but will it really "solve" for complexity, or just bring us another round of "pushing complexity around?" This presentation considers an alternative view of the centralization/decentralization debate. - See more at: https://www.nanog.org/meetings/abstract?id=2571
Tuesday, June 2, 2015
Dynamic Service Chaining for SDN/NFV
"Modern service provider and data center networks demand a cloud delivery model for agile and cost-effective rollout of services for revenue generation. There are some proposals to incorporate service insertion in the emerging SDN and NFV environments. However, the current methods are sub-optimal, complex and inflexible in delivering automated end-to-end service delivery.
In a virtualized cloud environment, service delivery requires intelligence in the network for policy-based traffic handling and differentiated services. Dynamic service chaining is a fundamental component in building an on-demand and scalable model for policy enforcement. This session will cover the following topics:
1. NFV & Service chaining use cases
2. Challenges with existing solutions
3. Opportunities with dynamic service chaining
4. Policy Enforcement model in SDN/NFV environment
5. Dynamic Service Chaining Architecture & Design considerations
6. Benefits of Dynamic Service Chaining"
See more at: https://www.nanog.org/meetings/abstract?id=2595
Network Automation and Programmability Abstraction Layer
See more at: https://www.nanog.org/meetings/abstract?id=2588
Notes:
Based on the Ansible server mgmt system. Requires the ability to run an Ansible agent on the network device. Works on Juniper and Arista. Not on Cisco IOS or NX-OS.
Works on IOS-XR with python library - need to look at library and see what they did to make it work with XR and whether that could be adapted to IOS or NX-OS. https://github.com/fooelisa/pyiosxr/
Device must support atomic changes and provide a config change interface. Would need to write library to emulate that capability since not supported in IOS or NX-OS.
Changes are idempotent, logs for everything done. Can get diffs prior to commit. Ansible provides a sophisticated framework for changes, and napalm python scripts apply configs to JunOS and EoS boxes.
"The Python library dubbed Napalm - short for Network Automation and Programmability Abstraction Layer with Multivendor support - helps interact with devices from various vendors though a unified set of commands. It currently supports Arista, Juniper, Cisco (IOS-XR) and FortiGate. It provides methods to replace or merge the configurations on the devices, view a diff, commit the changes, or roll them back. It also provides an Ansible plugin, which enables administrators to push configurations directly from Ansible playbooks. Napalm is open source and available on Github: https://github.com/spotify/napalm We will present more details about Napalm, how to use it and show in a demo how to instantaneously configure a set of multivendor devices."
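The load/diff/commit/rollback workflow described above can be modeled in plain Python with difflib. This is a hypothetical miniature of the idea, not the napalm API:

```python
import difflib

# Minimal model of the candidate-config workflow: stage a candidate,
# inspect the diff against running config, then commit or discard.
class Device:
    def __init__(self, running):
        self.running = running
        self.candidate = None

    def load_candidate(self, config):
        self.candidate = config

    def compare_config(self):
        return "".join(difflib.unified_diff(
            self.running.splitlines(keepends=True),
            self.candidate.splitlines(keepends=True),
            "running", "candidate"))

    def commit(self):
        self.running, self.candidate = self.candidate, None

    def discard(self):
        self.candidate = None

dev = Device("interface eth0\n mtu 1500\n")
dev.load_candidate("interface eth0\n mtu 9000\n")
print(dev.compare_config())        # shows - mtu 1500 / + mtu 9000
dev.commit()
print("mtu 9000" in dev.running)   # True
```

The key property Ansible/napalm rely on is that the diff is computed before the commit, so a change can be reviewed (or discarded) without touching the running device — exactly what native IOS/NX-OS lacks per the notes above.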
Latency IPv4 vs IPv6: Understanding the difference
During the last few years much attention was paid to the process of deploying IPv6 in different regions. With growing IPv6 adoption the requirements also grew: today we require not just working IPv6 but reliable IPv6. This brings to our attention IPv6 network latency, which we want to be comparable to IPv4 latency. First measurements gave the community very interesting and promising results - for some networks IPv6 was faster than IPv4. But there was no clear evidence why IPv6 had less latency. In this report I'm going to discuss reasons why in some cases IPv6 is faster than IPv4 and why it is slower in other ones. - See more at: https://www.nanog.org/meetings/abstract?id=2591
Tried to understand claims of faster IPv6 performance. Assumed that it must arise from different network paths. Found more paths in v4 than in v6. "Comparing different graphs" (where "graph" means "set of network paths")
Decided to compare connectivity for v6 vs v4 hosts. Found v6 is a subgraph of v4, as expected. Found that most paths going through Hurricane Electric - connectivity of HE is better than anyone else in v6.
Question: Is HE Tier-1? Who is Tier-1? No one will explain who is a member of Tier-1 club.
So comparisons of v6 speed are comparisons of HE performance at this time.
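The comparison method amounts to set operations over measured path edges; a toy sketch with invented AS-level edges (HE is AS6939), illustrating both the subgraph finding and the HE-concentration finding:

```python
# Model each protocol's measured paths as a set of AS-level edges and
# test whether the v6 graph is a subgraph of the v4 graph. Edge data
# is invented for illustration only.
v4_edges = {("AS1", "AS2"), ("AS2", "AS3"),
            ("AS1", "AS6939"), ("AS6939", "AS3")}
v6_edges = {("AS1", "AS6939"), ("AS6939", "AS3")}

print(v6_edges <= v4_edges)   # True: v6 paths form a subset of v4's

# Fraction of v6 edges traversing Hurricane Electric (AS6939):
he_share = sum(1 for e in v6_edges if "AS6939" in e) / len(v6_edges)
print(he_share)               # 1.0 in this toy data
```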
The benefits of deploying IPv6 only
"Use of IPv6 is gaining momentum. For some companies it is the primary IP version of choice to run their businesses and to deliver products and services to their customers. The depletion of globally routable IPv4 and private IPv4 address space has prompted many technology firms to look at the use of IPv6 while abandoning support for IPv4 in some parts of their networks. Additionally, and perhaps more interestingly, IPv6 has been rumored to outperform IPv4. New data is available from more diverse adopters that further substantiates these claims."
Notes:
FB is IPv6 internally. Networking said "we're out of RFC1918, time to move."
Decided to make it all v6 and no v4 to avoid managing two stacks and committing TCAM to both network representations. Simpler for them to go with one system. Converted all apps to IPv6 and are entirely IPv6 now.
Facebook claims that their metrics show better performance for IPv6 vs IPv4. They speculate that the absence of middleboxes (NATs, etc.) is a probable cause. They are going to look into it more and use happy-eyeballs clients to see if these results hold up. iOS is v4/v6 capable, so they are using that as a testbed. FB is still trying to get AAAA records for their nameservers and will have them in a few months. Took them several years to get to IPv6.
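Happy Eyeballs (RFC 6555) races an IPv6 and an IPv4 connection attempt, giving IPv6 a short head start. A toy model of the selection logic (the 250 ms attempt delay and the timings are illustrative, not FB's or Apple's implementation):

```python
def happy_eyeballs_pick(v6_connect_ms, v4_connect_ms, attempt_delay_ms=250):
    """Toy model of Happy Eyeballs: the IPv6 attempt starts first, the IPv4
    attempt starts only after attempt_delay_ms, and the client uses whichever
    attempt completes first. None means that family never connected."""
    inf = float("inf")
    v6_done = v6_connect_ms if v6_connect_ms is not None else inf
    v4_done = attempt_delay_ms + v4_connect_ms if v4_connect_ms is not None else inf
    if v6_done == inf and v4_done == inf:
        return None
    return "IPv6" if v6_done <= v4_done else "IPv4"

print(happy_eyeballs_pick(180, 40))   # IPv6: finishes at 180 ms vs 290 ms
print(happy_eyeballs_pick(600, 40))   # IPv4: 290 ms beats 600 ms
print(happy_eyeballs_pick(None, 40))  # IPv4: v6 never connects
```

The head start is why a modestly slower IPv6 path can still win the race, which matters when interpreting client-side family-selection statistics like the ones quoted below.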
FB metrics show different graphs for IPv6 traffic vs v4 because corporate America doesn't have v6. Over 12 percent of FB traffic is v6. iOS will only select v6 via happy eyeballs 20 percent of the time over a given path.
FB "multiproxygen" will use v6 80 percent of the time because of faster perf.
The FB app has its own network stack. FB wrote it to avoid network perf issues across multiple platforms; it is the same stack they use on all their own servers. Write once, use everywhere. They were able to make IPv6 work before the various platforms supported it. FB hopes to open-source this. Apple is treating v6 like a second-class citizen, but the FB app works the way FB wants WRT v6 because of the internal stack.
Interesting result of this session is that big outfits, FB and other providers, have gone to v6 internally and shut off v4 when they can to simplify and avoid costs. Must be nice to have total control of your internal environment.
Tutorial: NetOps Coding 101
https://drive.google.com/folderview?id=0BweIzzj21fzEfllZbnJOQXoxbmIyeHJjS2VZWFY3ajBsMEZXZFBlTVhTTTFnQjJuWnhTM0k
Notes:
Excellent intro to Python tutorial organized by the Facebook engineer who made this presentation on Monday
The tutorial provides a copy of VirtualBox and a Linux VM pre-loaded with the IPython IDE. It then steps through developing a Python regex-based parser for syslog files, and then taking action on them. This is a minimal/intro version of what they do at Facebook to help run their network: https://www.facebook.com/notes/facebook-engineering/making-facebook-self-healing/10150275248698920
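In the spirit of the tutorial's exercise, a minimal regex parser for IOS-style link-state syslog lines (the sample line and message format are illustrative, not Facebook's actual tooling):

```python
import re

# IOS-style line-protocol message (sample format for illustration).
LINEPROTO = re.compile(
    r"%LINEPROTO-5-UPDOWN: Line protocol on Interface (?P<intf>\S+), "
    r"changed state to (?P<state>up|down)"
)

def parse_lineproto(line):
    """Return {'interface': ..., 'state': ...} for a match, else None."""
    m = LINEPROTO.search(line)
    if not m:
        return None
    return {"interface": m.group("intf"), "state": m.group("state")}

sample = ("Jun 3 10:15:02 sw1 301: %LINEPROTO-5-UPDOWN: Line protocol on "
          "Interface GigabitEthernet0/1, changed state to down")
print(parse_lineproto(sample))  # {'interface': 'GigabitEthernet0/1', 'state': 'down'}
```

A parsed dict like this is the natural input to an "act on it" step, such as opening a ticket or draining the interface, which is the pattern the tutorial builds toward.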
"David Swafford is a Network Engineer on Facebook's Production Engineering team. Having come to FB a few years back as a traditional network engineer, he quickly shifted focus to the software side after realizing that there simply wasn't enough time in the day to do things the traditional way. That shift has paid off well for FB as the entire network team shifted in the same manner and now we focus on much more interesting problems together. When nerding out the most, he loves to look deep into the things that make networking and large systems tick. Topics like routing protocols, TCP, and Linux fall into his favorite bedroom reading. Outside of all this, he loves to cook any and everything under the theme of Italian-American, following in the footsteps of his favorite chef Scott Conant. He also loves being active with his dog Cocoa. Best reached using daswafford@gmail.com; you can also find him regularly at the SF Python Meet-ups (www.meetup.com/sfpython/). Any and all levels are welcome!
Abstract: Note: this is a single 3-hour session spanning two back-to-back tutorial slots. No coding / Python experience needed or assumed.
Do you find yourself wanting to write a robot but not quite sure where to start? Ever find yourself typing the same set of commands on the CLI all day long? If so, this is the class for you!
In this session, we'll focus on scripting in Python as a way to tackle a few common problems faced by all of us in the network world -- issues that typically steal your attention, but don't have to anymore. We'll take a practical approach to automating the humans and give you the foundation to go out and build your own version of Facebook's FBAR (https://www.facebook.com/notes/facebook-engineering/making-facebook-self-healing/10150275248698920).
This session will be targeted to the scripting n00b -- assuming no prior coding or Python knowledge. While we won't have time to cover a proper "Intro to Python", relevant syntax will be explained as we walk through the session and you'll be armed with the knowledge to go out and build!
One tip! We will be working from a virtual machine of Ubuntu Desktop throughout the session. A copy will be distributed on USB sticks, but if you would prefer a direct link, please download DEVBOX01.zip from https://drive.google.com/folderview?id=0BweIzzj21fzEfllZbnJOQXoxbmIyeHJjS2VZWFY3ajBsMEZXZFBlTVhTTTFnQjJuWnhTM0k. This VM was built using Virtual Box and will open natively, but has also been tested / confirmed to work under VMware. VirtualBox is a free download available from https://www.virtualbox.org/wiki/Downloads"
Monday, June 1, 2015
Research Networking Track
Training Network Administrators in a Game-Like Environment - Murat Yuksel
Training with game system to provide practice for ISP networking on things like load balancing configurations. On average could get 15-20% improvement in results (better configurations) as a result of practicing with a network game for configuration. Researcher's primary ask: what training should we focus on? Which tasks?
Water, not Land [A perspective on IPv4 runout] - Julie Percival
Attempt to use social science modeling to look at IPv6 deployment issues. However, there was a pretty major tech disconnect on the details, so the model might be useful but it wasn't proven. Nonetheless, it's a good idea, and clearly the developers of IPv6 did not meet their goals for deployment. So "what went wrong" is a good question, and it's interesting that this is the first NANOG at which I have heard that question asked.
Evaluating Network Buffer Size requirements for Very Large Data Transfers - Michael Smitasin
ESnet / Nat'l lab traffic patterns. Lots of labs have 100Gbps links and ESnet is going to 100Gbps core. Petabyte/sec generating 50Gbps and new requirements will be 180Gbps.
Bufferbloat: the premise is that big buffers cause higher latency, with bad impact. But that concern applies to mouse flows, not the elephant flows considered here.
Large decrease in throughput over long distance for very small losses.
Need large enough buffers for common microburst conditions, but no more.
So, is it possible to detect proper buffer size?
With a packet burst size of 65, no losses were ever seen. A 300-packet burst size will see packet loss if the link is congesting and buffers are being used.
Linux kernels 3.11 and above support fair queueing (the fq qdisc); just enabling FQ made a significant improvement in throughput over long paths.
Used iperf3 to generate loads. Used nuttcp to simulate burst traffic
sch_fq in Linux does packet pacing, which may be what is going on
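On a Linux sender, enabling FQ is a one-liner with tc. A hedged sketch (eth0 is a placeholder interface name; the fq qdisc requires a reasonably recent kernel, and these commands need root):

```shell
# Enable the fq qdisc (with its built-in pacing) on the sending host.
tc qdisc replace dev eth0 root fq

# Confirm it took effect and inspect queue/pacing statistics.
tc -s qdisc show dev eth0
```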
FQ is promising but cannot replace sufficient buffers
Can we say how big buffers should be? You need to test and find out.
http://fasterdata.es.net
http://people.usuc.edu/~warner/buffer.html
Used reno stack
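A first-order starting point for "how big" is the bandwidth-delay product: a single large flow needs roughly bandwidth times RTT of buffering to absorb a loss event without draining the pipe. The numbers below are illustrative:

```python
def bdp_bytes(bandwidth_bps: float, rtt_seconds: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to keep the pipe full."""
    return bandwidth_bps * rtt_seconds / 8

# Illustrative: a 10 Gbps path with a 50 ms cross-country RTT.
print(bdp_bytes(10e9, 0.050) / 1e6, "MB")  # 62.5 MB
```

As the talk stresses, BDP is only a starting point; microburst behavior means you still have to test to find what a given path actually needs.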
Pacific Research Platform - John Hess and Camille Crittenden
CENIC initiatives. CA networking to schools, etc. 3,800mi of optical fiber, 10,000 sites, 20M daily users.
New NSF-funded initiative called CITRIS (Center for IT Research in the Interest of Society). 6 of 10 UC campuses, new IT solutions with social benefit: health, energy and environment, robotics, connected communities, nanotech.
Science DMZ/High-performance network security [time-permitting] - Michael Sinatra - See more at: https://www.nanog.org/meetings/abstract?id=2563
Monitoring, managing and troubleshooting large scale networks
See more at: https://www.nanog.org/meetings/abstract?id=2562
Facebook engineering with some great war stories on microbursts and FB apps that violate TCP backoff behavior.
They had a DB app that would detect congestion and move traffic to other links. So they would find congestion events that migrated to other network bundles between other pairs of gear.
Had to fix the algorithm in the DB app (Tau), which fixed the bizarre congestion issue that behaved like a broken TCP state machine. Took them months to years to find and resolve. Issue 1: service centers were complaining before the network group found the issue.
Created a new detection suite to look for loss and latency (they re-invented smokeping alerting). They found that acceptable loss for TCP was 0.1%.
A tiny amount of loss could result in a 50 percent reduction in throughput, even at minimal RTT.
Any loss is seen as congestion, and TCP aggressively backs off. Tried various TCP algos - Vegas worst, Illinois best, 4x difference in perf
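The sensitivity they observed is what the classic Mathis model predicts: steady-state TCP throughput is bounded by (MSS/RTT) * C/sqrt(p), so throughput falls with the square root of the loss rate p. A sketch with illustrative numbers:

```python
from math import sqrt

def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=sqrt(1.5)):
    """Mathis et al. steady-state TCP throughput bound: (MSS/RTT) * C/sqrt(p).
    C ~ sqrt(3/2) for Reno-style loss recovery."""
    return (mss_bytes * 8 / rtt_s) * c / sqrt(loss_rate)

# Illustrative: 1460-byte MSS, 10 ms RTT. Even 0.1% loss caps the flow
# around 45 Mbps, regardless of how fast the link is.
print(mathis_throughput_bps(1460, 0.010, 0.001) / 1e6, "Mbps")
```

Quadrupling the loss rate halves the bound, which is why "tiny" loss dominates throughput on fast paths.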
Next, the DB team reported that they couldn't recover from network outages; throughput wouldn't ramp back up. Found a 15x difference between congestion-control algorithms in recovery: Reno didn't recover, CUBIC recovered rapidly.
Facebook deployed cubic everywhere.
Tuned their detection system to lower the number of alerts; interface issues are the primary thing to look at. They have a set of tools that locates interface issues and automatically resolves them by moving traffic to other links and opening a ticket. Auto-resolution. No new change control or SLAs, no new staff.
Set of scripts called "FBAR" - processes 3.6B syslog msgs, 1 percent are real issues, then FBAR interacts with FB devices 750K times per month and 97 percent resolution.
Would need 150 engineers to solve the same set of issues in the same amount of time.
https://www.facebook.com/groups/netengcode/
Monitoring, managing and troubleshooting large scale networks. "Almost four years ago I came to NANOG and mostly complained about the state of monitoring networks, par for the course for me. A lot has changed since then; we've solved many of the problems I addressed. Perhaps more importantly, we've fundamentally changed how we manage, monitor and troubleshoot our network. We plan to share what we learned, what went well, and best of all, what went oh so terribly wrong. Our driving philosophy behind this effort is that by taking an engineering approach to operations, you can greatly reduce the time to discover, mitigate and resolve issues on your network. We analyzed our faults, our pain points and the work that consumed most of our time. This allowed us to prioritize what we tackled first; we were surprised by what we learned caused the most outages, and how much impact minor network issues can have when they fall in the right place. From this, today, the majority of the faults that occur in our network are automatically detected and mitigated, all without human intervention."
SENSS - Proposed Security Service
SENSS:
Security Service for the Internet
"Distributed network attacks, such as DDoS and BGP prefix hijacking, can severely hurt online businesses and disrupt critical infrastructure services. The main challenge in handling such attacks is their distributed nature: the best locations to diagnose and mitigate them are often far from the victim's network. Today's Internet has no automated mechanism for victims to ask help of remote ISPs, and has low incentives for remote ISPs to offer such services. Consequently, prefix hijacking attacks go largely unmitigated, and victims of DDoS attacks pay exorbitant prices to large CDNs to distribute their contents and thus sustain the attacks. We propose SENSS, a programmable security service for the Internet. SENSS brings simple and generic programmable interfaces from SDN to inter-AS security. These interfaces can be easily implemented in today's ISPs; victims use them to observe and control their own traffic and routes in remote ISPs, and pay per use. We show how victims can leverage these simple interfaces to design solutions against many attacks. We provide six such custom programs that handle a variety of DDoS and BGP prefix hijacking attacks, many of which are not handled today. We evaluate SENSS through extensive simulations and prototype implementation, using realistic traffic and Internet topology, and show that it is very effective in sparse deployment (with adoption in 20 large ISPs, SENSS can eliminate 80-96% of DDoS attack traffic and correct 92-99% of polluted ASes for BGP prefix hijacking), and it has low message overhead and delay."
See more at: https://www.nanog.org/meetings/abstract?id=2568
Home Access Congestion Points
Home Network or Access Link? Locating Last-mile Downstream Throughput Bottlenecks - See more at: https://www.nanog.org/meetings/nanog64/agenda
First platform was Netgear WNDR3700v2 - 64 homes worldwide
Second test Netgear WNR 3500L, fewer resources but got into 2600 homes in US (FCC assisted with this?)
Found a significant bottleneck 40-50 percent of the time across most users. This conflates both the access link and wifi.
Found that most issues are wifi overload. As soon as carrier throughput rises above 20 Mbps, the bottleneck moves from the access link to the wifi link; wifi bottlenecks at roughly 20 Mbps.
On one of the router platforms, which had two radios, they could determine that 2.4 GHz was an issue and that 5 GHz performed better. Note that the AT&T residential gateway does not provide 5 GHz, so "GigaPower" is being delivered to most customers over 2.4 GHz.
Locates analysis on the home router - useful for carriers that provide a home router and are willing to instrument it. Uses data to try to discover where the bottlenecks may be located. Uses pcap to collect 10K pps samples about 3x per hour to avoid load on limited home router resources. Looks at the level of packets in buffers to detect the "bottleneck".
"As home networks see increasingly faster downstream throughput speeds, a natural question is whether users are benefiting from these faster speeds or simply facing performance bottlenecks in their own home networks. In this work, we study a simple question: Are users’ downstream throughput bottlenecks in their home networks or in their access ISPs? Although our question is simple, answering it on a large-scale deployment is challenging because accurate measurements require instrumentation of a home router, which is too resource-constrained for many existing throughput analysis tools. In this work, we identify lightweight metrics that can accurately identify whether a throughput bottleneck lies inside or outside a user’s home network and develop a detection algorithm that accurately locates these bottlenecks. We validate this algorithm in controlled settings and report on two deployments, one of which is the FCC deployment with 2,652 homes in the United States. We find that wireless bottlenecks are more common than access-link bottlenecks—particularly for home networks with downstream throughput greater than 20 Mbps, where access-link bottlenecks are relatively rare. In homes with a wireless bottleneck, it is equally likely that only one device experiences the bottleneck as it is for all devices to experience the wireless bottleneck simultaneously."
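A toy version of that buffer-occupancy test: sample the router's WAN-side and WLAN-side interface queues and blame whichever side stays persistently backed up. The thresholds are invented; this illustrates the idea, not the paper's actual algorithm:

```python
def locate_bottleneck(wan_queue, wlan_queue, depth=10, busy_frac=0.5):
    """Toy bottleneck locator: a side is the bottleneck if its interface
    queue holds at least `depth` packets in more than `busy_frac` of the
    samples. (Invented thresholds; the deployed algorithm is more careful.)"""
    def busy(samples):
        return sum(s >= depth for s in samples) / len(samples)
    wan_b, wlan_b = busy(wan_queue), busy(wlan_queue)
    if max(wan_b, wlan_b) <= busy_frac:
        return "none"
    return "wifi" if wlan_b >= wan_b else "access-link"

# WAN queue nearly empty, WLAN queue persistently deep: blame the wifi side.
print(locate_bottleneck([0, 1, 0, 2], [15, 30, 22, 0]))  # wifi
```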
Internet Order - FCC regulations
Internet Order – a regulator’s read of the FCC’s recent Open Internet Order, and related regulatory proceedings.
"Chris Witteman is a senior litigation attorney for the California Public Utilities Commission. His talk will focus on the FCC's February 26, 2015 Open Internet Order, and provide a digestible overview of the Order's 400 pages, and what it might mean in practice for consumers, competitive carriers, and content and service providers. The talk will focus on the “No throttling,” “No paid prioritization,” and “No unreasonable interference” regulations adopted by the FCC, and how the FCC is likely (or not) to interpret and enforce those regulations. Attention will be paid to interconnection, transit, and peering issues that many believe are integral to ensuring an open Internet: how will the FCC monitor and approach those issues under the Order, and how might they do so in the future?"
See more at: https://www.nanog.org/meetings/nanog64/agenda
Notes:
Interesting take on recent FCC thrashing on trying to figure out Internet rules for data carriage, how badly they screwed up in 2005, and how they appear to be recovering with the new rules on open Internet.
"Paid prioritization is a business decision and cannot be allowed. Not regarded as network management." Carriers cannot interfere with consumers' access to lawful content. However, the FCC did not get involved in the details of peering, deferring to a "reasonableness" interpretation (full employment for lawyers) and to best practices and tech standards (which involves NANOG and BCPs).
NANOG 64 - meeting and attendance counts
NANOG 64 notes
Largest NANOG so far - 1172 attendees:
https://www.nanog.org/meetings/nanog64/attendees
Big crowds in hallways, serious congestion, was wishing for some weighted RED with discard at some points.
As always, lots of interesting stuff and things I didn't know. Very clear that the center of gravity WRT activities and developments has moved to places like Google/Netflix/Facebook et al.
212 ISPs at this meeting
174 Network Operators
160 Vendors
then all the rest
So NANOG is still about ISP/operators, but the operators / developers are focused on the big dogs of Goog, FB, etc.
And that's a wrap. Three solid days in conference rooms: Much Geek, so Technical (wow)
(Yes, I am seriously behind the curve in my Internet memes.)