Network Basics


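Switch

A switch is a layer 2 (data link layer) device and one of the most common pieces of network equipment; multiple computers can connect to it. Because it works at layer 2, a switch can inspect the fields of the Ethernet protocol to identify a specific computer and make sure data is delivered only to that machine. This greatly reduces collisions on the network and improves throughput.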

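Router

A hub is a layer 1 device, a switch is a layer 2 device, and a router is a layer 3 device that knows how to forward data between different networks. Just as a switch inspects Ethernet fields to decide where to send a frame, a router inspects IP fields to decide where to send a packet. A router keeps a routing table internally, which holds information on how to route data to different networks around the world.

Routers share data with each other through BGP (Border Gateway Protocol), which lets them learn the best path for forwarding traffic. When you open a web browser and load a web page, the traffic between your computer and the web server may cross dozens of different routers. The Internet is incredibly large and complicated, and routers are the global guides that get traffic to the right place.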

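A network cable usually connects to an RJ-45 network port through an RJ-45 plug. A network port is generally attached directly to, or is part of, a device that makes up a computer network; for example, every computer has at least one network port.

A patch panel is a device containing many network ports. It does no other work; it only joins different network cables together.

A bit is the smallest unit of data a computer can understand; it is either 1 or 0. The strings of ones and zeros sent across a network link are the lowest-level building blocks of frames and packets, which are concepts that belong to the other network layers.

These ones and zeros are sent across the network through a process called modulation, a way of varying the voltage of the charge moving across the cable. When used for computer networks, this kind of modulation is more specifically known as line coding. It allows the devices at either end of a link to understand that charge in one state is a 0 and in another state is a 1.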

PSTN(Public Switched Telephone Network) is also referred to as the POTS(Plain Old Telephone Service).

A dial-up connection uses POTS for data transfer, and gets its name because the connection is established by actually dialing a phone number.

Modem stands for modulator/demodulator. A modem takes data that computers can understand and turns it into audible wavelengths that can be transmitted over POTS.

A baud rate is a measurement of how many bits could be passed across a phone line in a second.

What is broadband?

In terms of internet connectivity, it’s used to refer to any connectivity technology that isn’t dial-up Internet. Broadband Internet is almost always much faster than even the fastest dial-up connections and refers to connections that are always on. This means that they’re long lasting connections that don’t need to be established with each use. They’re essentially links that are always present.

T-Carrier Technologies

T-Carrier Technologies were originally invented by AT&T in order to transmit multiple phone calls over a single link.

T1 stands for Transmission System 1.

A T1 communicates at speeds of 1.544 Mb/sec.

A T3 is just 28 multiplexed T1 lines.
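The T1 figure can be checked with a little arithmetic. The sketch below assumes the standard T1 layout of 24 voice channels at 64 kb/s each plus 8 kb/s of framing overhead; those channel figures are an assumption brought in for illustration and are not stated in the text above.

```python
# T1 line rate, assuming the standard layout (not stated in the text):
# 24 voice channels of 64 kb/s each plus 8 kb/s of framing overhead.
CHANNELS_PER_T1 = 24
CHANNEL_RATE_BPS = 64_000
FRAMING_BPS = 8_000

t1_bps = CHANNELS_PER_T1 * CHANNEL_RATE_BPS + FRAMING_BPS
print(f"T1 rate: {t1_bps / 1_000_000} Mb/s")

# A T3 multiplexes 28 T1 lines, so it carries this many voice channels.
t3_channels = 28 * CHANNELS_PER_T1
print(f"Voice channels on a T3: {t3_channels}")
```

This is the original motivation for T-carrier lines: many phone calls share a single link.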

DSL(digital subscriber line) was able to send much more data across the wire than traditional dial-up technologies.

DSL technologies use DSLAMs or Digital Subscriber Line Access Multiplexers to establish data connections across phone lines.

ADSL - ADSL stands for Asymmetric Digital Subscriber Line. ADSL connections featured different speeds for outbound and incoming data. Generally, this means faster download speeds and slower upload speeds.

SDSL - SDSL stands for Symmetric Digital Subscriber Line. SDSL technology is basically the same as ADSL, except the upload and download speeds are the same.

FTTN stands for fiber to the neighborhood: fiber is used to deliver data to a single physical cabinet that serves a certain portion of the population.

FTTB stands for fiber to the building, fiber to the business, or even fiber to the basement, since this is generally where cables physically enter a building.

FTTH stands for fiber to the home, used where fiber is actually run to each individual residence in a neighborhood or apartment building.

FTTP stands for fiber to the premises. FTTH and FTTB may both also be referred to as FTTP.

The most common specifications for how wireless networking devices should communicate, are defined by the IEEE 802.11 standards. This set of specifications, also called the 802.11 family, make up the set of technologies we call WiFi.

A frequency band is a certain section of the radio spectrum that’s been agreed upon to be used for certain communications.

WiFi networks operate on a few different frequency bands. Most commonly, the 2.4 gigahertz and 5 gigahertz bands. There are lots of 802.11 specifications including some that exist just experimentally or for testing. The most common specifications you might run into are 802.11b, 802.11a, 802.11g, 802.11n, and 802.11ac.

WEP stands for Wired Equivalent Privacy. It's an encryption technology that provides a very low level of privacy.

WPA stands for Wi-Fi Protected Access. By default it uses a 128-bit key, making it a whole lot more difficult to crack than WEP.

WPA2 is an update to the original WPA. WPA2 uses a 256-bit key, making it even harder to crack.

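The data link layer is essentially an abstraction over the physical layer. It lets the other layers send and receive data without caring which hardware or devices the physical layer uses, so the network, transport, and application layers keep working the same way however the physical hardware changes.

Ethernet

Ethernet is the most widely used data link layer protocol for sending data across a single link. It was first proposed in 1980 and standardized in 1983; since then it has only seen minor adjustments to keep up with increasing bandwidth.

Ethernet is the most common communication standard used by today's local area networks (LANs). It defines the cable types and signal processing methods used in a LAN.

Ethernet is a broadcast network built on the CSMA/CD (Carrier Sense Multiple Access with Collision Detection) mechanism.

CSMA/CD determines when the communication channel is clear and when a device is free to transmit data, in order to avoid collisions.

CSMA/CD works by checking whether data is currently being transmitted on the network segment. If not, the device sends its data; if so, it waits for a random interval and then tries to send again. The destination node is identified by its MAC address.

MAC address types

A broadcast domain is the region of a computer network, made up of every computer or device attached to it, that can be reached directly by sending a simple frame to the data link layer broadcast address. It is a logical part of the network in which any device can send data directly to any other device without going through a router (gateway). All devices or nodes in one broadcast domain sit in the same LAN or VLAN.

Communication between different nodes in a LAN or VLAN usually includes: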

The first part of an Ethernet frame is known as the preamble. A preamble is 8 bytes or 64 bits long and can itself be split into two sections. The first seven bytes are a series of alternating ones and zeros. These act partially as a buffer between frames and can also be used by the network interfaces to synchronize internal clocks they use, to regulate the speed at which they send data. This last byte in the preamble is known as the SFD or start frame delimiter. This signals to a receiving device that the preamble is over and that the actual frame contents will now follow.

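Destination MAC Address - The hardware (physical) address of the intended recipient.

Source MAC Address - The physical address of the device that sent the Ethernet frame.

Type Field - A 16-bit EtherType field that identifies the protocol carried in the frame's payload. For a tagged VLAN frame, for example, this field indicates 802.1Q.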

Payload - Data payload of an Ethernet frame. A payload in networking terms is the actual data being transported, which is everything that isn’t a header. The data payload of a traditional Ethernet frame can be anywhere from 46 to 1500 bytes long.

CRC - The cyclical redundancy check (CRC) checksum is a 4-byte (32-bit) number that represents a checksum value for the entire frame.
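The frame layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a driver-level implementation: the preamble and SFD are left out because software normally never sees them, the function names are hypothetical, and the sketch assumes that the frame check sequence is the CRC-32 computed by `zlib.crc32` appended least significant byte first.

```python
import struct
import zlib


def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{byte:02x}" for byte in raw)


def build_frame(dst: bytes, src: bytes, ether_type: int, payload: bytes) -> bytes:
    """Assemble destination, source, type, payload and CRC into one frame."""
    # The payload of a traditional frame is at least 46 bytes, so pad with zeros.
    padded = payload.ljust(46, b"\x00")
    body = dst + src + struct.pack("!H", ether_type) + padded
    # Assumption: FCS is zlib's CRC-32, appended least significant byte first.
    fcs = struct.pack("<I", zlib.crc32(body))
    return body + fcs


def parse_frame(frame: bytes) -> dict:
    """Split a frame into its fields and re-check the CRC."""
    dst, src, ether_type = struct.unpack("!6s6sH", frame[:14])
    payload, fcs = frame[14:-4], frame[-4:]
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "ether_type": ether_type,
        "payload": payload,
        "fcs_ok": struct.pack("<I", zlib.crc32(frame[:-4])) == fcs,
    }


frame = build_frame(
    bytes.fromhex("001122334455"),  # destination MAC
    bytes.fromhex("080027c30f80"),  # source MAC
    0x0800,                         # EtherType for IPv4
    b"hello",
)
print(len(frame), parse_frame(frame))
```

Flipping any byte of the frame makes `fcs_ok` come back `False`, which is how a receiver detects corruption in transit.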

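A layer 2 switching device usually corresponds to one LAN, while a VLAN is a virtual broadcast domain, usually created inside a layer 2 switch. Most modern layer 2 switches can also be interconnected, so ports on different switches can belong to the same VLAN.

VLANs effectively reduce broadcast load and improve network performance. Traditionally, VLANs were introduced for better security and grouping control, and most layer 2 switch vendors ship devices with a default VLAN already configured. In addition, newer SDN (software-defined networking) environments work only with the VLAN concept.

Frames in different VLANs are isolated from each other in transit: users in one VLAN cannot communicate directly with users in another VLAN.

Communication between different VLANs goes through a routing device.

Devices running STP (Spanning Tree Protocol) exchange information with each other to discover loops in the network and selectively block certain interfaces, pruning the looped topology into a loop-free tree. This prevents frames from circulating endlessly and protects devices from the performance drop caused by repeatedly receiving the same frames.

STP is a protocol for eliminating loops in a LAN:

Link aggregation combines multiple links, abstracting several network interfaces into one logical interface. Its purpose is to increase bandwidth; for example, four aggregated 100 Mb/s links provide 400 Mb/s. It also adds fault tolerance: when one link becomes unavailable, the aggregated link as a whole stays available. The number of aggregated links is usually a power of two (2, 4, 8).

Link aggregation differs between vendors. F5 BIG-IP uses a trunk to represent a group of network interfaces, and a trunk can aggregate at most 8 links. A BIG-IP trunk has its own MAC address, which it uses to communicate with its peer.

LACP (Link Aggregation Control Protocol) is defined by IEEE standard 802.3ad. It provides mechanisms such as link error detection and retransmission to keep the aggregated link reliable and fault tolerant.

LACP behavior varies between network vendors. Linux bonding, for example, provides a software LACP implementation with flexible options such as the load distribution method. On F5 BIG-IP, LACP is an optional setting; its behavior can be customized, for example the traffic weight of each link, and some control policies can also be configured.

Link aggregation can run in manual mode or in LACP mode.

Table 7. The two link aggregation modes

LACP mode is a link aggregation mode that uses the LACP protocol. Devices exchange Link Aggregation Control Protocol Data Units (LACPDUs) and negotiate to make sure the member interfaces at the far end belong to the same device and the same aggregation interface.

In LACP mode, both devices must select the same number of active interfaces, otherwise the link aggregation group cannot be established. To achieve this, one end can act as the active side while the other (passive) end selects its active interfaces to match.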

Frame Control - Frame control field is 16 bits long, and contains a number of sub-fields that are used to describe how the frame itself should be processed.

Duration ID - It specifies how long the total frame is. So, the receiver knows how long it should expect to have to listen to the transmission.

Address - There are four address fields, because there needs to be room to indicate which wireless access point should be processing the frame. So, we’d have our normal source address field, which would represent the MAC address of the sending device.

Sequence Control - Sequence control field is 16 bits long and mainly contains a sequence number used to keep track of ordering the frames.

Data payload - Data payload section which has all of the data of the protocols further up the stack.

FCS - Frame check sequence field which contains a checksum used for a cyclical redundancy check.

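Traditional IP routing and forwarding

Traditional IP forwarding works hop by hop. At every router a packet passes through, it is decapsulated so the router can read its network layer information, then looked up in the routing table using the longest-match rule to decide where to forward it. Each router repeats this decapsulate, look up, and re-encapsulate cycle, so forwarding performance is low.

Characteristics of traditional IP forwarding:

PPP provides the authentication protocols PAP (Password Authentication Protocol) and CHAP (Challenge Handshake Authentication Protocol).

PPP is highly extensible. For example, when PPP needs to be carried over an Ethernet link, it can be extended to PPPoE.

PPP provides LCP (Link Control Protocol) to negotiate link layer parameters such as the maximum receive unit and the authentication mode.

PPP provides various NCPs (Network Control Protocols), such as IPCP (IP Control Protocol), to negotiate network layer parameters, which gives better support for network layer protocols.

PPPoE (PPP over Ethernet) is a link layer protocol that encapsulates PPP frames inside Ethernet frames. It lets multiple hosts on an Ethernet network connect to a remote broadband access server.

PPPoE combines the advantages of PPP and Ethernet: the flexible networking of Ethernet plus the authentication and accounting features of PPP.

PPPoE provides point-to-point connections over Ethernet. A PPP session is set up between the PPPoE client and the PPPoE server to carry PPP packets, giving hosts on the Ethernet an access service with user control and accounting. It is widely used in enterprise and carrier networks.

Common PPPoE scenarios include dial-up Internet access for home users and for enterprise users.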

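Header Length - A 4-bit field that declares how long the entire header is. For IPv4 the header is almost always 20 bytes long. In fact, 20 bytes is the minimum length of an IP header; you cannot properly describe an IP header in less than 20 bytes.

Service Type - An 8-bit field used to specify details about QoS (quality of service) technologies. QoS allows routers to decide which IP datagram among many is the most important.

Total Length - A 16-bit field that indicates the total length of the IP datagram. The maximum size of a single datagram is the largest number 16 bits can represent, 65,535.

Identification - A 16-bit number used to group messages together. When the data to send is larger than a single datagram can hold, the IP layer splits it into several smaller packets; the receiving end then uses the Identification field to recognize that the pieces belong to the same original datagram.

Flag - Indicates whether a datagram is allowed to be fragmented, or that it has already been fragmented.

Fragmentation - The process of splitting one large IP datagram into several smaller ones.

TTL - An 8-bit field that specifies how many router hops a datagram may cross before it is discarded.

Protocol - An 8-bit field that identifies which transport layer protocol is in use. The most common transport layer protocols are TCP and UDP.

Header Checksum - A checksum over the IP datagram header, similar to the Ethernet checksum field. Because the TTL field is modified at every router the datagram passes through, the Header Checksum has to be recalculated at every hop as well.

Source IP Address - 32 bits long; the source IP address.

Destination IP Address - 32 bits long; the destination IP address.

Option - An optional field used to set special characteristics for the datagram, mostly for testing purposes.

Padding - Because the Option field is optional and variable in length, the Padding field is a series of zeros that brings the header up to the correct total size.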
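As a rough illustration of the header fields above, the following Python sketch unpacks a 20-byte IPv4 header and computes its header checksum. The sample header bytes and the function names are made up for this example; the checksum routine is the usual ones' complement sum over 16-bit words.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def parse_ipv4_header(header: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header."""
    (ver_ihl, service_type, total_length, identification, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", header[:20])
    return {
        "version": ver_ihl >> 4,
        "header_bytes": (ver_ihl & 0x0F) * 4,  # field counts 32-bit words
        "service_type": service_type,
        "total_length": total_length,
        "identification": identification,
        "flags": flags_frag >> 13,
        "fragment_offset": flags_frag & 0x1FFF,
        "ttl": ttl,
        "protocol": protocol,  # 6 = TCP, 17 = UDP
        "checksum": checksum,
        "src": str(ipaddress.IPv4Address(src)),
        "dst": str(ipaddress.IPv4Address(dst)),
    }


# Hypothetical sample header with the checksum field set to zero.
sample = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
fields = parse_ipv4_header(sample)
print(fields)
print(hex(internet_checksum(sample)))
```

A router that decrements the TTL would rerun `internet_checksum` over the modified header with the checksum field zeroed and write the new value back.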

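Class A

The first octet is the network ID and the last three octets are the host ID. In binary, the first bit is 0, so the first octet ranges from 00000000 to 01111111.

0.0.0.0 - 127.255.255.255

Class B

The first two octets are the network ID and the last two octets are the host ID. In binary, the first two bits are 10, so the first octet ranges from 10000000 to 10111111.

128.0.0.0 - 191.255.255.255

About 64,000 hosts per network

Class C

The first three octets are the network ID and the last octet is the host ID. In binary, the first three bits are 110, so the first octet ranges from 11000000 to 11011111.

192.0.0.0 - 223.255.255.255

Class D

In binary, the first four bits are 1110. Used for multicast, where a single IP datagram can be delivered to multiple addresses.

224.0.0.0 - 239.255.255.255

Class E

In binary, the first four bits are 1111. Reserved for future use.

240.0.0.0 - 255.255.255.255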
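The class of an address follows directly from its first octet. A minimal sketch, with a hypothetical helper name:

```python
def address_class(ip: str) -> str:
    """Return the classful category (A-E) of a dotted-decimal IPv4 address."""
    first_octet = int(ip.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError(f"not a valid IPv4 address: {ip}")
    if first_octet < 128:   # leading bit 0
        return "A"
    if first_octet < 192:   # leading bits 10
        return "B"
    if first_octet < 224:   # leading bits 110
        return "C"
    if first_octet < 240:   # leading bits 1110
        return "D"
    return "E"              # leading bits 1111


for address in ("10.1.1.100", "172.16.0.1", "192.168.1.100", "224.0.0.1"):
    print(address, address_class(address))
```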

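Non-routable addresses

Non-routable address space is a set of IP ranges that anyone may use but that cannot be routed. Not every computer connected to the Internet needs to communicate with every other computer connected to the Internet, and non-routable address space was defined for that case: nodes on such a network can talk to each other, but no gateway router will try to forward traffic to that network.

For IPv4, non-routable address space consists mainly of three ranges:

Table 10. Non-routable IPv4 addresses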
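The rows of the table are not reproduced in this text. The sketch below uses the three ranges defined in RFC 1918 and checks whether an address falls inside any of them; the helper name is hypothetical.

```python
import ipaddress

# The three private IPv4 ranges defined in RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip: str) -> bool:
    """True if the address lies in one of the RFC 1918 ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in RFC1918_NETWORKS)


for candidate in ("10.1.1.100", "172.31.255.1", "172.32.0.1", "8.8.8.8"):
    print(candidate, is_rfc1918(candidate))
```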

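IPv4 subnetting

Classful addressing is too rigid and too coarse: large numbers of host addresses go unused, which wastes a great deal of IP address space. Subnetting reduces this waste by splitting one large classful network into several smaller subnets, so that addresses are used more sensibly. Let's look at how subnetting is done.

An IPv4 address in class A or class B can belong to a network with a very large number of hosts, so such a network needs to be divided further into smaller networks, which are called subnets.

A subnet mask is also a 32-bit number, normally written as four decimal numbers joined by dots. In binary, a subnet mask is a run of consecutive 1s followed by a run of consecutive 0s. The table below shows some subnet masks in binary and decimal:

Table 11. Subnet masks

CIDR (classless inter-domain routing) identifies networks and subnets by an IP address plus a mask length, instead of dividing network addresses into the traditional classes A, B, C and so on.

CIDR allows masks of any length and treats IP addresses as one continuous address space. Prefixes of any length can be allocated, and several contiguous prefixes can be aggregated into one network, which effectively reduces the number of routing table entries.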
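Python's standard `ipaddress` module can illustrate the relationship between prefix lengths and subnet masks, as well as subnetting and prefix aggregation. The example networks below are arbitrary private ranges chosen for illustration.

```python
import ipaddress

# A prefix length and a dotted-decimal mask describe the same thing.
network = ipaddress.ip_network("192.168.1.0/24")
print(network.netmask, network.num_addresses)

# Subnetting: split one /16 into four /18 subnets.
subnets = list(ipaddress.ip_network("10.1.0.0/16").subnets(new_prefix=18))
print([str(subnet) for subnet in subnets])

# Aggregation: two contiguous /24 prefixes collapse into one /23,
# so a router needs one routing table entry instead of two.
merged = list(ipaddress.collapse_addresses([
    ipaddress.ip_network("10.1.0.0/24"),
    ipaddress.ip_network("10.1.1.0/24"),
]))
print([str(prefix) for prefix in merged])
```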

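ICMP (Internet Control Message Protocol) is a helper protocol for IP. It carries error and control messages between network devices and plays a vital role in gathering network information and in diagnosing and troubleshooting network faults.

ICMP packet structure: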

Type - Type field is eight bits long which specifies what type of message is being delivered.

Code - Code field indicates a more specific reason for the message than just the type.

Checksum - A 16-bit checksum that works like the checksum fields in other protocol units, such as the Ethernet frame, IP datagram, and TCP segment.

Rest of header - A 32 bit field with an uninspired name, this field is optionally used by some of the specific types and codes to send more data.

Data payload - Data payload for an ICMP packet exists entirely so that the recipient of the message knows which of their transmissions caused the error being reported.
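To make the field layout concrete, here is a sketch that assembles an ICMP echo request (type 8, code 0) in Python. For echo messages the 32-bit "rest of header" holds a 16-bit identifier and a 16-bit sequence number; the function names and the sample values are illustrative only.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of all 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Type (8 bits), code (8 bits), checksum (16 bits), rest of header (32 bits), payload."""
    icmp_type, code = 8, 0
    # The checksum is computed with the checksum field itself set to zero.
    unchecked = struct.pack("!BBHHH", icmp_type, code, 0, identifier, sequence) + payload
    checksum = internet_checksum(unchecked)
    return struct.pack("!BBHHH", icmp_type, code, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(packet.hex())
```

A receiver can validate the packet by running the same checksum over the whole message, which yields zero when nothing was corrupted.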

$ arp -e -i eth1
Address                  HWtype  HWaddress           Flags Mask            Iface
192.168.33.1             ether   0a:00:27:00:00:05   C                     eth1
server.example.com       ether   08:00:27:c3:0f:80   C                     eth1

$ sudo tcpdump -vvv -nn -w arp.cap -i eth1 arp
$ tcpdump -r arp.cap
reading from file arp.cap, link-type EN10MB (Ethernet)
02:09:34.895590 ARP, Request who-has server.example.com tell 192.168.33.101, length 46
02:09:34.895609 ARP, Reply server.example.com is-at 08:00:27:c3:0f:80 (oui Unknown), length 28

5. On the server, check the MAC address and verify that it matches the MAC address in the client's ARP table from step 3 and in the packet capture from step 4

$ ip addr show eth1 | grep ether
    link/ether 08:00:27:c3:0f:80 brd ff:ff:ff:ff:ff:ff

6. Analyze the ARP request packet in detail

$ route -nv
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.1.10.2       0.0.0.0         UG    0      0        0 external
0.0.0.0         10.1.1.1        0.0.0.0         UG    9      0        0 mgmt
10.1.1.0        0.0.0.0         255.255.255.0   U     0      0        0 mgmt
10.1.10.0       0.0.0.0         255.255.255.0   U     0      0        0 external
10.1.20.0       0.0.0.0         255.255.255.0   U     0      0        0 internal
127.1.1.0       0.0.0.0         255.255.255.0   U     0      0        0 tmm
127.7.0.0       127.1.1.253     255.255.0.0     UG    0      0        0 tmm
127.20.0.0      0.0.0.0         255.255.0.0     U     0      0        0 tmm_bp

Viewing routing information with ip route on Linux

$ ip route list
default via 10.1.10.2 dev external
default via 10.1.1.1 dev mgmt  metric 9  mtu 1500
10.1.1.0/24 dev mgmt  proto kernel  scope link  src 10.1.1.245
10.1.10.0/24 dev external  proto kernel  scope link  src 10.1.10.240
10.1.20.0/24 dev internal  proto kernel  scope link  src 10.1.20.240
127.1.1.0/24 dev tmm  proto kernel  scope link  src 127.1.1.254
127.7.0.0/16 via 127.1.1.253 dev tmm
127.20.0.0/16 dev tmm_bp  proto kernel  scope link  src 127.20.0.254
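The longest-match rule used to pick a route from a table like the one above can be sketched in Python. The route list is copied from the output shown; the second default route is left out because this simplified lookup ignores metrics, and the function name is hypothetical.

```python
import ipaddress

# (prefix, gateway, interface), taken from the routing table output above.
# A gateway of None means the network is directly connected.
ROUTES = [
    ("0.0.0.0/0", "10.1.10.2", "external"),
    ("10.1.1.0/24", None, "mgmt"),
    ("10.1.10.0/24", None, "external"),
    ("10.1.20.0/24", None, "internal"),
    ("127.1.1.0/24", None, "tmm"),
    ("127.7.0.0/16", "127.1.1.253", "tmm"),
    ("127.20.0.0/16", None, "tmm_bp"),
]


def lookup(destination: str):
    """Return the matching route with the longest prefix (ignores metrics)."""
    address = ipaddress.ip_address(destination)
    matches = [
        route for route in ROUTES
        if address in ipaddress.ip_network(route[0])
    ]
    return max(matches, key=lambda route: ipaddress.ip_network(route[0]).prefixlen)


print(lookup("10.1.20.7"))   # directly connected network
print(lookup("8.8.8.8"))     # only the default route matches
```

Because the default route 0.0.0.0/0 matches every address, `matches` is never empty here; it simply loses to any more specific prefix.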

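Routes get into the routing table in one of three ways:

IGP (Interior Gateway Protocol) - Interior gateway protocols are used by routers to share information within a single autonomous system (AS): the collection of all IP networks and routers under the control of one, or sometimes several, entities, such as the internal network of a company or organization.

EGP (Exterior Gateway Protocol) - Exterior gateway protocols are used by routers to share information between autonomous systems.

Distance-vector routing protocols are an older standard. A router using a distance-vector protocol takes its routing table, which lists every network it knows about and how far away each one is in hops, and sends it to every neighboring router, that is, every router directly connected to it. In computer science a list is known as a vector, which is where the name distance-vector comes from. With a distance-vector protocol, a router doesn't really know much about the overall state of the autonomous system; it only has some information about its immediate neighbors.

Link-state routing protocols take a more sophisticated approach to determining the best path to a network. They get their name because each router advertises the state of the link on each of its interfaces. These interfaces may connect to other routers or directly to networks. The information about each router is propagated to every other router in the autonomous system, which means every router in the system knows every detail about every other router in the system.

OSPF Version 2 (RFC 2328) is currently used for IPv4, and OSPF Version 3 (RFC 2740) for IPv6. Unless stated otherwise, OSPF in the rest of this chapter means OSPF Version 2.

OSPF routers exchange LS (link state) information rather than routes directly. LS information is the key input that OSPF needs for topology and route calculation.

An OSPF router collects the LS information in the network and stores it in the LSDB (link state database). Every router knows the network topology within its area, which helps it calculate loop-free paths.

Each OSPF router uses the SPF (shortest path first) algorithm to calculate the shortest path to each destination, then installs the resulting routes in its routing table.

OSPF supports VLSM (Variable Length Subnet Mask) and manual route summarization.

The multi-area design allows OSPF to support larger networks.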

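If an IP datagram is larger than the MTU (Maximum Transmission Unit) of the network it has to cross, it is first split into several packets that are then sent across the network. This process is called fragmentation. Fragmentation can happen at the originating host or at a router along the path.

Fragmentation can cause extra retransmission: if one fragment is lost, the whole datagram, and therefore all of its fragments, may have to be retransmitted.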
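A small worked example shows how a datagram is cut up for a given MTU. The sketch assumes a 20-byte IPv4 header without options and relies on the fact that fragment offsets are expressed in units of 8 bytes, so every fragment except the last carries a multiple of 8 data bytes. The function name and the sizes are chosen for illustration.

```python
def fragment(total_length: int, mtu: int, header_length: int = 20) -> list:
    """Describe the fragments needed to carry one datagram over a link with the given MTU."""
    payload = total_length - header_length
    # Data per fragment: what fits in the MTU, rounded down to a multiple of 8.
    max_data = (mtu - header_length) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload:
        size = min(max_data, payload - offset)
        fragments.append({
            "offset_units": offset // 8,          # value of the fragment offset field
            "data_bytes": size,
            "more_fragments": offset + size < payload,
        })
        offset += size
    return fragments


# A 4000-byte datagram crossing a link with a 1500-byte MTU.
for piece in fragment(4000, 1500):
    print(piece)
```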

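An IPv6 address has two parts: the network prefix and the interface ID. Unlike IPv4, IPv6 has a standard subnet mask, /64, that is used for almost all normal addresses. In that case half of the address is the network prefix and the other half is the interface ID, which means a single subnet can hold as many hosts as necessary.

Subnet allocation: a network provider typically allocates a shorter prefix, such as /48, to an organization. That leaves the rest of the network part available for assigning subnets from the allocated prefix. Beyond the 48 allocated bits, 16 bits remain for subnets (up to 65,536 subnets). No two interfaces on the same subnet may have the same interface ID; the interface ID identifies a particular interface on the subnet.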
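The subnet arithmetic can be verified with the `ipaddress` module. The /48 below is taken from the 2001:db8::/32 documentation range and stands in for a provider-assigned prefix; it is an assumption made for this example.

```python
import ipaddress

# Hypothetical site allocation from the documentation range 2001:db8::/32.
site = ipaddress.ip_network("2001:db8:1234::/48")

# Bits between the /48 allocation and the standard /64 subnet boundary.
subnet_bits = 64 - site.prefixlen
subnet_count = 2 ** subnet_bits
print(subnet_bits, subnet_count)

# The first /64 subnet inside the allocation and its interface ID space.
first_subnet = next(site.subnets(new_prefix=64))
print(first_subnet, first_subnet.num_addresses)
```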

fd00::/8

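Unique local address (RFC 4193)

IPv6 has no direct equivalent of the RFC 1918 private address space, although this comes close. A site can use these addresses to self-allocate a private routable IP address space inside the organization, but these networks cannot be used on the global Internet. The site must randomly select a /48 from this space, and can then subdivide the allocation into /64 networks as normal.

fe80::/64

Link-local address

Every IPv6 interface automatically configures a link-local address that is valid only on the local link of that network. Link-local unicast addresses allow for local network segment communications and are configured based upon a host's MAC address.

ff00::/8

Multicast

Equivalent to 224.0.0.0/4 in IPv4. Multicast is used to transmit to multiple hosts at the same time, and it is particularly important in IPv6 because IPv6 has no broadcast addresses.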

Traffic Class - An 8-bit field that defines the type of traffic contained within the IP datagram and allows for different classes of traffic to receive different priorities.

Flow Label - A 20-bit field that's used in conjunction with the traffic class field for routers to make decisions about the quality of service level for a specific datagram.

Payload Length - A 16-bit field that defines how long the data payload section of the datagram is.

Next header - The next header field defines what kind of header is immediately after this current one.

Hop limit - An 8-bit field that’s identical in purpose to the TTL field in an IPv4 header.

Source address - 128 bits length

Destination address - 128 bits length

IPv6 and IPv4 Harmony

IPv6 tunnels - IPv6 tunnels are conceptually pretty simple. They consist of IPv6 tunnel servers on either end of a connection. These IPv6 tunnel servers take incoming IPv6 traffic and encapsulate it within traditional IPv4 datagrams.

IPv6 tunnel broker - Companies that provide IPv6 tunneling endpoints for you, so you don’t have to introduce additional equipment to your network.


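The sending side may have several processes that need to send data, but only one transport protocol at any given time. This many-to-one situation calls for multiplexing: the protocol accepts messages from different processes and tells them apart by the port number in the message header. Once the header has been added, the transport layer can hand the segment down to the network layer.

The receiving side is the mirror image. After it gets a packet from the network layer, it faces a one-to-many situation, which calls for demultiplexing. After error checking and removing the header, the transport layer uses the port number to deliver the message to the right process.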

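Header Length - A 4-bit field, also called the data offset field. It gives the length of the TCP header in a TCP segment, so the receiving network device knows where the actual data payload begins.

Control flags - The control flags of the TCP segment.

TCP window - A 16-bit number that specifies the range of sequence numbers that may be sent before an acknowledgment is required.

Checksum - A 16-bit field that works like the checksum fields in IP and Ethernet. When the recipient receives the TCP segment, it calculates a checksum over the whole segment and compares it with the value in this field, to make sure no data was lost or corrupted in transit.

Urgent - Used together with one of the TCP control flags to mark a segment as more important than others, or as having special meaning.

Options - Rarely used, but sometimes used for more complicated flow control protocols.

Padding - A sequence of zeros that ensures the data payload section begins at the expected location.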

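URG (urgent)

A value of 1 indicates that the segment is considered urgent. This flag is used together with the Urgent field in the TCP header, which carries more information.

ACK (acknowledge)

A value of 1 means that the Acknowledgment number field should be examined.

PSH (push)

The transmitting device wants the receiving device to push the currently buffered data to the application as soon as possible.

RST (reset)

One side of the TCP connection has been unable to recover properly from a series of missing or malformed segments.

SYN (synchronize)

Used when first establishing a TCP connection; tells the receiving end to examine the Sequence number field.

FIN (finish)

The transmitting computer has no more data to send, and the connection can be closed.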
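Each of the six control flags occupies one bit in the TCP header, so a set of flags can be decoded with simple bit masks. A minimal sketch with a hypothetical helper name:

```python
# Bit masks of the six control flags, lowest bit first.
TCP_FLAGS = [
    ("FIN", 0x01),
    ("SYN", 0x02),
    ("RST", 0x04),
    ("PSH", 0x08),
    ("ACK", 0x10),
    ("URG", 0x20),
]


def decode_flags(flag_bits: int) -> list:
    """Return the names of all flags set in the given flag bits."""
    return [name for name, mask in TCP_FLAGS if flag_bits & mask]


print(decode_flags(0x02))  # first segment of a handshake
print(decode_flags(0x12))  # reply that both synchronizes and acknowledges
```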

// Step One
A sends a TCP segment to B with the SYN flag set, telling B that client A's initial sequence number is J. (`Let's establish a connection and look at my sequence number field, so we know where this conversation starts.`)
// Step Two
B responds with a TCP segment in which both the SYN and ACK flags are set, telling A that server B's initial sequence number is K and acknowledging A's sequence number with an ACK value of J + 1. (`Sure, let's establish a connection and I acknowledge your sequence number.`)
// Step Three
A responds again with just the ACK flag set, acknowledging server B's sequence number with an ACK value of K + 1. (`I acknowledge your acknowledgement. Let's start sending data.`)
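The sequence and acknowledgment numbers exchanged in the three steps can be traced with a short simulation. This is only a model of the bookkeeping, not a real TCP implementation; the dictionary layout and the initial sequence numbers are invented for the example.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments of a handshake as simple dictionaries."""
    modulus = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = {"from": "A", "flags": ["SYN"], "seq": client_isn}
    syn_ack = {
        "from": "B",
        "flags": ["SYN", "ACK"],
        "seq": server_isn,
        "ack": (client_isn + 1) % modulus,  # acknowledges J with J + 1
    }
    ack = {
        "from": "A",
        "flags": ["ACK"],
        "seq": syn_ack["ack"],
        "ack": (server_isn + 1) % modulus,  # acknowledges K with K + 1
    }
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```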

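A handshake is a way for two devices to make sure they are speaking the same protocol and will be able to understand each other.

A TCP connection operates in full duplex: every TCP segment sent in either direction is answered with a TCP segment that has the ACK flag set, so the sender knows the receiver got it.

Establishing a TCP connection takes an exchange of three TCP segments, while terminating one takes four; the four-way handshake is an abstraction of TCP connection termination.

Either the client or the server may initiate the close of a TCP connection. The side that starts the close, usually by calling close(), performs what is called the active close; the other side performs the passive close. The figure below shows an active close initiated by server B: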

SYN_RECEIVED

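A socket previously in the LISTEN state has received a segment with the SYN flag set and has replied with SYN and ACK, but the connection is not yet established; it is waiting for the client's ACK. This state is only seen on the server side.

ESTABLISHED

The TCP connection is in working order and both sides are free to send each other data. This state is seen on both the client and the server side.

FIN_WAIT

A segment with the FIN flag has been sent, but the corresponding ACK from the other end has not been received yet.

CLOSE_WAIT

The connection has been closed at the TCP layer, but the application that opened the socket has not released it yet.

CLOSED

The TCP connection has been fully terminated and no further communication is possible.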

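System ports and ephemeral ports

The transport layer relies on port numbers for end-to-end communication. Both TCP and UDP are tied to port numbers; a port number is a 16-bit number (range 0 - 65535). Port numbers are divided into system ports and ephemeral ports.

Table 25. Port ranges defined by IANA

1024 - 49151

Registered ports. IANA does not control these ports, but it registers them and publishes a list of how they are used as a convenience to the community. Some of these ports may be familiar: 3306 is the MySQL port and 8080 is a Tomcat/JBoss port.

49152 - 65535

These are known as private or ephemeral ports. Ephemeral ports cannot be registered with IANA. They are picked at random as the source port on the client side of a TCP connection: in a client/server program, the server usually listens on a registered port, and the client is assigned an ephemeral port when it opens a connection.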
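The port ranges can be captured in a small classifier. The sketch assumes that the first range, whose row is not reproduced in the text above, is the system port range 0 - 1023; the function name is hypothetical.

```python
def port_category(port: int) -> str:
    """Classify a port number into the IANA-defined ranges."""
    if not 0 <= port <= 65535:
        raise ValueError("a port number must fit in 16 bits")
    if port <= 1023:
        return "system"      # assumed range 0 - 1023
    if port <= 49151:
        return "registered"  # 1024 - 49151
    return "ephemeral"       # 49152 - 65535


for port in (80, 3306, 8080, 50000):
    print(port, port_category(port))
```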

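In the TCP/IP architecture, HTTP is an application layer protocol and sits at the top of the TCP/IP stack. Its design and use therefore build on the other protocols in the TCP/IP family. For example, it relies on DNS to translate domain names into IP addresses, and it has to establish a TCP connection before it can transfer documents.

HTTP is, clearly, also a client/server protocol. Here the client is the browser and the server is the web server. When you browse the web, the browser exchanges information with the web server over HTTP. Every time you move from one resource on the web to another, the browser uses HTTP to contact the web server, and the request includes information about the resource it wants.

When a browser and a server exchange web documents over HTTP, they can actually exchange many different document types. The formats of these document types are defined by MIME (Multipurpose Internet Mail Extensions), a technical standard for describing multimedia data transferred over the Internet.

HTTP supports communication between clients (usually browsers) and servers, which send data to each other. One server can serve many clients spread around the world. An HTTP transaction consists of the following four steps:

Unlike protocols such as FTP, which must hold a connection open, HTTP is stateless. For every HTTP operation the browser and server establish a connection and then drop it again. When you visit a website, the browser connects to the server in order to download an HTML file from it. In HTTP 1.0, the browser closes the connection as soon as it has received the file, and if it discovers that it needs more files (for example, images), it has to establish a new connection. In HTTP 1.1, mechanisms exist that let the browser and server keep the original connection open and reuse it to request the follow-up content.

A single HTTP operation is usually called a transaction. HTTP uses a TCP connection that is kept only for the duration of that transaction; neither the browser nor the server remembers the state of the previous connection.

HTTP uses this stateless design purely to make servers more efficient. When you click a hyperlink, the browser may well move from the current site to a different one. So whenever a hyperlink is clicked, the server assumes the user is leaving and drops the connection. If the user continues browsing, a new connection is established. If the user really is leaving, the server has nothing left to do because the connection is already closed.

Statelessness also has a drawback. Without state, the protocol has no memory of earlier transactions. If a later transaction needs information from an earlier one, that information has to be kept outside the protocol. The lack of state means the earlier information must be sent again, which inevitably increases the amount of data transferred per connection. In practice, state is usually kept with a client-side cookie working together with a server-side session ID.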

DHCP stands for Dynamic Host Configuration Protocol, which is an application layer protocol that automates the configuration process of hosts on a network. With DHCP, a machine can query a DHCP server when the computer connects to the network and receive all the network configuration in one go.

DHCP is an application layer protocol, which means it relies on the transport, network, data link and physical layers to operate.

The process by which a client configured to use DHCP attempts to get network configuration information is known as DHCP discovery.

DHCP Discovery - The DHCP client sends a DHCP discover message (DHCPDISCOVER) out onto the network. The DHCPDISCOVER message is encapsulated in a UDP datagram with a destination port of 67 and a source port of 68, which is then encapsulated inside an IP datagram with a destination IP of 255.255.255.255 and a source IP of 0.0.0.0.

DHCP Offer - The DHCP server examines its own configuration and decides what IP address, if any, to offer to the client. The response is sent as a DHCPOFFER message with a destination port of 68, a source port of 67, a destination broadcast IP of 255.255.255.255, and the server's actual IP as the source.

DHCP Request - The DHCP client responds to the DHCPOFFER message with a DHCPREQUEST message, which essentially says, yes, I would like to have the IP that you offered me. Since the IP hasn't been assigned yet, this is again sent from an IP of 0.0.0.0 to the broadcast IP of 255.255.255.255.

DHCP ACK - The DHCP server receives the DHCPREQUEST message and responds with a DHCPACK, or DHCP acknowledgement, message, which is again sent to the broadcast IP of 255.255.255.255, with a source IP corresponding to the actual IP of the DHCP server.

NAT(Network Address Translation)

Taken literally, NAT (Network Address Translation) means translating one IP address into another.

NAT is a technology that allows a gateway, usually a router or a firewall, to rewrite the source IP of an outgoing IP datagram while retaining the original IP in order to rewrite it into the response.

IP masquerading is an important security concept. The most basic concept at play here, is that no one can establish a connection to your computer if they don’t know what IP address it has. By using NAT in the way we’ve just described, we could actually have hundreds of computers on network A, all of their IPs being translated by the router to its own. To the outside world, the entire address space of network A is protected and invisible. This is known as one to many NAT, and you’ll see it in use on lots of LANs today.

Port preservation is a technique where the source port chosen by a client is the same port used by the router.

Port forwarding is a technique where specific destination ports can be configured to always be delivered to specific nodes.
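One-to-many NAT with port preservation can be modeled as a pair of lookup tables. The class below is a simplified sketch, not how any particular router implements it: the class and method names, the public address 203.0.113.1 (a documentation address), and the fallback port range are all assumptions, and the model never expires entries or checks for port exhaustion.

```python
class NatTable:
    """Toy model of one-to-many NAT on a router with a single public IP."""

    def __init__(self, public_ip: str, first_fallback_port: int = 49152):
        self.public_ip = public_ip
        self._outbound = {}  # (private_ip, private_port) -> public_port
        self._inbound = {}   # public_port -> (private_ip, private_port)
        self._next_port = first_fallback_port

    def _allocate_port(self) -> int:
        """Find the next public port that is not already in use."""
        while self._next_port in self._inbound:
            self._next_port += 1
        return self._next_port

    def translate_outbound(self, src_ip: str, src_port: int):
        """Rewrite the source of an outgoing datagram and remember the mapping."""
        key = (src_ip, src_port)
        if key not in self._outbound:
            # Port preservation: keep the client's port when it is still free.
            if src_port not in self._inbound:
                public_port = src_port
            else:
                public_port = self._allocate_port()
            self._outbound[key] = public_port
            self._inbound[public_port] = key
        return self.public_ip, self._outbound[key]

    def translate_inbound(self, dst_port: int):
        """Map a response arriving on a public port back to the private host."""
        return self._inbound.get(dst_port)


nat = NatTable("203.0.113.1")
print(nat.translate_outbound("10.1.1.100", 50000))  # port preserved
print(nat.translate_outbound("10.1.1.101", 50000))  # port already taken
print(nat.translate_inbound(50000))
```

Port forwarding would correspond to pre-populating the inbound table with fixed entries, so that traffic for a given public port always reaches the same internal node.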

LACNIC - covers Central and South America and any parts of the Caribbean not covered by ARIN.

RIPE - serves Europe, Russia, the Middle East, and portions of Central Asia.

Non-routable address space was defined in RFC1918 and consists of several different IP ranges that anyone can use.

An unlimited number of networks can use non-routable address space internally because internet routers won't forward traffic to it. This means there's never any global collision of IP addresses when people use those address spaces.

Non-routable address space is largely usable today because of technologies like NAT.

Virtual Private Networks or VPNs, are a technology that allows for the extension of a private or local network, to a host that might not work on that same local network.

VPNs are a tunneling protocol. Which means, they provision access to something not locally available.

VPNs usually require strict authentication procedures in order to ensure that only authorized computers and users can connect to them. In fact, VPNs were one of the first technologies where two-factor authentication became common.

Two-factor authentication is a technique where more than just a username and password are required to authenticate. Usually, a short-lived numerical token is generated by the user through a specialized piece of hardware or software.

Building site-to-site connections with VPNs

VPNs can also be used to establish site-to-site connectivity. It’s just that the router, or sometimes a specialized VPN device on one network, establishes the VPN tunnel to the router or VPN device on another network. This way, two physically separated offices might be able to act as one network and access network resources across the tunnel.

A proxy service is a server that acts on behalf of a client in order to access another service. Proxies sit between clients and other servers, providing some additional benefit: anonymity, security, content filtering, increased performance, and a few other things.

The term proxy doesn't refer to any specific implementation. Proxies exist at almost every layer of our networking model.

Reverse proxy

A reverse proxy is a service that might appear to be a single server to external clients, but actually represents many servers living behind it.

Reverse proxies in modern web application architecture:

The session layer is responsible for things like facilitating the communication between actual applications and the transport layer.

The presentation layer is responsible for making sure that the unencapsulated application layer data is actually able to be understood by the application in question.

Network A contains address space 10.1.1.0/24, and computer A is part of Network A and has been assigned an IP address of 10.1.1.100

Network B contains address space 192.168.1.0/24, and computer B is part of Network B and has been assigned an IP address of 192.168.1.100, web server on computer B is running and listening on port 80

Router sits between network A and network B, with an interface configured with an IP of 10.1.1.1 on network A, and an interface at 192.168.1.1 on network B

An end user sitting at computer A opens up a web browser and enters 192.168.1.100 into the address bar to access the content in computer B’s web server

The web browser communicates with the local networking stack (part of the OS), which knows that it's going to establish a TCP connection to 192.168.1.100, port 80, on another network.

Computer A looks at its ARP table to determine the MAC address of its gateway, 10.1.1.1, but it doesn't find any corresponding entry, so it sends an ARP request for 10.1.1.1 to the broadcast MAC address.

Router receives the ARP message and sees that it is currently assigned the IP address 10.1.1.1, so it responds to computer A with its own MAC address, 00:11:22:33:44:55.

Computer A receives this response, now knows the hardware address of its gateway, and is ready to start constructing the outbound packet.

At the application layer, the web browser asks to open a socket and gets the ephemeral port 50000 from computer A's OS.

At the transport layer, the networking stack starts to build a TCP segment with the appropriate fields in the header, including a source port of 50000 and a destination port of 80. The sequence number field is filled with an appropriate sequence number, the SYN flag is set, and a checksum for the segment is calculated and written to the checksum field.

The TCP segment is passed along to the network layer, which encapsulates it in an IP datagram: the IP header is filled in with the source IP, the destination IP, and a TTL of 64, the TCP segment becomes the data payload of the IP datagram, and a checksum is calculated and put in the checksum field.

The IP datagram is passed along to the data link layer, which constructs an Ethernet frame: 00:11:22:33:44:55 is filled in as the destination MAC address and computer A's MAC address as the source MAC address, the IP datagram is inserted as the data payload of the Ethernet frame, and a calculated checksum is written to the checksum field.

The Ethernet frame is ready to be sent across the physical layer , the network interface connected to computer A sends this binary data as modulations of the voltage of an electrical current running across a CAT6 cable that’s connected between it and a network switch.

This switch receives the frame and inspects the destination MAC address. The switch knows which of its interfaces this MAC address is attached to, and forwards the frame across only the cable connected to this interface.

Router receives the frame and recognizes its own hardware address as the destination, so it knows that this frame is intended for itself. It now takes the entirety of the frame and calculates a checksum against it. Router compares this checksum with the one in the Ethernet frame and sees that they match.

Router strips away the Ethernet frame, leaving it with just the IP datagram. Again, it performs a checksum calculation, this time against the datagram header, and again it finds that it matches. It then inspects the destination IP address and looks up this destination in its routing table; the lookup shows that the destination address 192.168.1.100 is on a locally connected network.

Still in Router, the TTL is decremented, a new checksum is calculated, and a new IP datagram is created. As in Step 8, this new IP datagram is again encapsulated in a new Ethernet frame, with the router's MAC address as the source and computer B's MAC address as the destination.

The new Ethernet frame is ready to be sent, and computer B receives the frame.

Computer B identifies its own MAC address as the destination and knows that the frame is intended for itself. Computer B then strips away the Ethernet frame, leaving it with the IP datagram. It performs a checksum check and recognizes that the data has been delivered intact. It then examines the destination IP address and recognizes it as its own.

Computer B strips away the IP datagram, leaving it with just the TCP segment. Again, the checksum for this layer is examined, and everything checks out.

Computer B examines the destination port, which is TCP port 80. The networking stack on computer B checks to ensure that there's an open socket on port 80, which there is. It's in the LISTEN state and held open by a running Apache web server.

Computer B then sees that this packet has the SYN flag set and knows that this is a TCP connection request.

Repeat steps 6 - 16 to form a TCP segment with the SYN and ACK flags set and the other fields filled in correctly.

Computer A receives the frame, sees the SYN and ACK flags, and knows that computer B is ready to establish the connection.

Repeat steps 6 - 16 to form a TCP segment with the ACK flag set and the other fields filled in correctly.

Computer B receives the frame, sees the ACK flag from computer A, and knows that computer A has acknowledged. It finally finishes the socket instantiation and sets the state to ESTABLISHED.