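Linux Network Protocol Stack
The Linux kernel provides an abstract network protocol stack. User-space applications reach it through system calls and rely on it to carry out network communication. The interface the kernel exposes is the socket. In Linux a socket is part of the file system abstraction, so network communication can be treated as reading and writing a file, which makes controlling the network as convenient for an application as working with files. Many articles on the Internet describe the Linux network stack; the material here is a digest of several of them, with thanks to the people who wrote and shared them.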
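Series of articles on segmentfault
- Packet receive process
- Packet send process
51CTO, "世民谈云计算" series
- A brief summary of the Linux network protocol stack
- Segmentation offloading in non-virtualized Linux environments: GSO/TSO/UFO/LRO/GRO
- Segmentation offloading in a QEMU/KVM + VxLAN environment (sending side)
University of Illinois course
- Linux Kernel Networking
University of New Hampshire article
- Linux IP Networking: A Guide to the Implementation and Modification of the Linux Protocol Stack
Unknown author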
- Anatomy of the Linux networking stack
- The linux networking architecture
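The introduction says that a socket is handled like a file. A minimal Python sketch of that idea follows, assuming a Unix-like system; the function name `echo_through_fds` is made up for this example. It creates a connected socket pair and moves data with `os.write` and `os.read`, the same descriptor-based calls used for ordinary files.

```python
import os
import socket


def echo_through_fds(payload: bytes) -> bytes:
    """Send payload across a socket pair using plain file-descriptor I/O."""
    left, right = socket.socketpair()  # two connected sockets
    try:
        # os.write / os.read operate on any file descriptor,
        # so the sockets are used here exactly like files or pipes.
        os.write(left.fileno(), payload)
        return os.read(right.fileno(), len(payload))
    finally:
        left.close()
        right.close()
```

Calling `echo_through_fds(b"hello")` should hand back the same bytes without using any socket-specific send or receive call.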
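Linux network path
A packet is received along this path: NIC -> memory -> CPU -> NIC driver -> memory -> kernel protocol stack -> user application.
A packet is sent along this path: user application -> kernel protocol stack -> NIC driver -> NIC.
Packet receive processing
NIC to memory (hardware)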
                   +-----+
                   |     |                            Memory
+--------+   1     |     |  2  DMA     +--------+--------+--------+--------+
| Packet |-------->| NIC |------------>| Packet | Packet | Packet | ...... |
+--------+         |     |             +--------+--------+--------+--------+
                   |     |<--------+
                   +-----+         |
                      |            +---------------+
                      |                            |
                    3 | Raise IRQ                  | Disable IRQ
                      |                          5 |
                      |                            |
                      ↓                            |
                   +-----+                   +------------+
                   |     |  Run IRQ handler  |            |
                   | CPU |------------------>| NIC Driver |
                   |     |       4           |            |
                   +-----+                   +------------+
                                                   |
                                                6  | Raise soft IRQ
                                                   |
                                                   ↓
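Step 6 above hands the remaining work to a soft IRQ. On Linux the per-CPU soft IRQ counters are exposed in `/proc/softirqs`, with `NET_RX` and `NET_TX` rows for networking. The parser below is a small sketch; the function name `parse_softirqs` is an assumption of this example, and it only relies on the "NAME: count count ..." row layout of that file.

```python
def parse_softirqs(text: str) -> dict:
    """Map each soft IRQ name to its list of per-CPU counters."""
    counts = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # the header row ("CPU0 CPU1 ...") has no colon
        counts[name.strip()] = [int(value) for value in rest.split()]
    return counts
```

On a Linux host, `parse_softirqs(open("/proc/softirqs").read())["NET_RX"]` returns one counter per CPU; watching it grow while traffic arrives is a quick way to see which CPUs handle receive work.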
Kernel networking module (NIC driver)
                                            +-----+
                                   17       |     |
                               +----------->| NIC |
                               |            |     |
                               | Enable IRQ +-----+
                               |
                               |
                         +------------+                                       Memory
                         |            |         Read          +--------+--------+--------+--------+
        +--------------->| NIC Driver |<--------------------- | Packet | Packet | Packet | ...... |
        |                |            |           9           +--------+--------+--------+--------+
        |                +------------+
        |                      |    |       skb
   Poll | 8      Raise softIRQ | 6  +------------------+
        |                      |           10          |
        |                      ↓                       ↓
+---------------+  Call  +-----------+        +------------------+        +--------------------+  12  +---------------------+
| net_rx_action |<-------| ksoftirqd |        | napi_gro_receive |------->| enqueue_to_backlog |----->| CPU input_pkt_queue |
+---------------+   7    +-----------+        +------------------+   11   +--------------------+      +---------------------+
                                                       |                                                         | 13
                                                    14 |         + - - - - - - - - - - - - - - - - - - - - - - - +
                                                       ↓         ↓
                                          +--------------------------+     15     +------------------------+
                                          | __netif_receive_skb_core |----------->| packet taps(AF_PACKET) |
                                          +--------------------------+            +------------------------+
                                                       |
                                                       | 16
                                                       ↓
                                              +-----------------+
                                              | protocol layers |
                                              +-----------------+
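The interplay of "Disable IRQ", "Poll" and "Enable IRQ" in the diagrams can be hard to picture. The toy model below is only a sketch: `ToyNic`, `poll` and the budget value are invented for illustration and are not kernel APIs. It shows why the driver turns the interrupt off: one interrupt is enough to trigger polling that drains many packets, and the interrupt is turned back on only when the ring is empty.

```python
from collections import deque


class ToyNic:
    """Toy NIC: a ring buffer plus an interrupt-enable flag."""

    def __init__(self):
        self.ring = deque()
        self.irq_enabled = True
        self.irqs_raised = 0

    def receive(self, packet):
        self.ring.append(packet)      # steps 1-2: packet lands in memory
        if self.irq_enabled:
            self.irqs_raised += 1     # step 3: raise IRQ
            self.irq_enabled = False  # step 5: driver disables further IRQs


def poll(nic, budget):
    """One poll pass: take at most `budget` packets off the ring."""
    delivered = []
    while nic.ring and len(delivered) < budget:
        delivered.append(nic.ring.popleft())
    if not nic.ring:
        nic.irq_enabled = True        # step 17: ring drained, re-enable IRQ
    return delivered
```

Feeding five packets into a `ToyNic` raises a single interrupt; two `poll` calls with a budget of 3 then deliver all five, and only the second call re-enables the interrupt.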
Protocol stack (IP layer)
           |
           |
           ↓        promiscuous mode &&
      +--------+    PACKET_OTHERHOST (set by driver)   +------------------+
      | ip_rcv |-------------------------------------->| drop this packet |
      +--------+                                       +------------------+
           |
           |
           ↓
+---------------------+
| NF_INET_PRE_ROUTING |
+---------------------+
           |
           |
           ↓
      +---------+
      |         | IP forwarding enabled +------------+        +-----------------+
      | routing |---------------------->| ip_forward |------->| NF_INET_FORWARD |
      |         |                       +------------+        +-----------------+
      +---------+                                                      |
           |                                                           |
           | destination IP is local                                   ↓
           ↓                                                   +---------------+
  +------------------+                                         | dst_output_sk |
  | ip_local_deliver |                                         +---------------+
  +------------------+
           |
           |
           ↓
  +------------------+
  | NF_INET_LOCAL_IN |
  +------------------+
           |
           |
           ↓
     +-----------+
     | UDP layer |
     +-----------+
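The branches in the IP-layer diagram can be restated as a small decision function. This is a simplified sketch, not kernel code: the function name `ip_input_path` and its parameters are invented here, netfilter hooks are left out, and the final "drop" for a non-local packet when forwarding is off is an assumption that the diagram does not show.

```python
import ipaddress


def ip_input_path(dst, local_addrs, forwarding_enabled, other_host=False):
    """Return the next stage for a received IPv4 packet."""
    if other_host:
        # Frame addressed to another host, seen only in promiscuous mode.
        return "drop"
    local = {ipaddress.ip_address(addr) for addr in local_addrs}
    if ipaddress.ip_address(dst) in local:
        return "ip_local_deliver"
    if forwarding_enabled:
        return "ip_forward"
    return "drop"  # assumption: not local and forwarding disabled
```

For a host that owns 192.168.1.10, a packet to that address goes to `ip_local_deliver`, while a packet to 10.0.0.1 goes to `ip_forward` only when forwarding is enabled.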
Protocol stack (UDP layer)
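Once a datagram reaches the UDP layer it is queued on the destination socket, and the application reads it with `recvfrom`. The sketch below shows the user-space end of that path on the loopback interface. It assumes 127.0.0.1 is usable on the machine, and `udp_loopback_roundtrip` is a name made up for this example.

```python
import socket


def udp_loopback_roundtrip(payload: bytes):
    """Send one datagram to ourselves and return what the receiver sees."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the kernel pick one
        receiver.settimeout(2.0)          # do not block forever on failure
        sender.sendto(payload, receiver.getsockname())
        data, peer = receiver.recvfrom(65535)
        return data, peer, sender.getsockname()
```

The `peer` address returned by `recvfrom` carries the sender's source port, which is the same port that `sender.getsockname()` reports after the send.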
Packet send processing
Socket layer
              +-------------+
              | Application |
              +-------------+
                     |
                     |
                     ↓
+------------------------------------------+
| socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP) |
+------------------------------------------+
                     |
                     |
                     ↓
           +-------------------+
           | sendto(sock, ...) |
           +-------------------+
                     |
                     |
                     ↓
             +--------------+
             | inet_sendmsg |
             +--------------+
                     |
                     |
                     ↓
             +---------------+
             | inet_autobind |
             +---------------+
                     |
                     |
                     ↓
               +-----------+
               | UDP layer |
               +-----------+
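The `inet_autobind` step is visible from user space: a UDP socket that was never bound has local port 0, and after the first `sendto` it has a port chosen by the kernel. The following is a minimal sketch assuming Linux and a usable loopback interface. The helper name `autobind_ports` and the destination port 9 are arbitrary choices for this example, and nothing needs to listen there.

```python
import socket


def autobind_ports():
    """Return the local port before and after the first sendto()."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM,
                       socket.IPPROTO_UDP) as sock:
        before = sock.getsockname()[1]        # unbound socket
        sock.sendto(b"x", ("127.0.0.1", 9))   # first send triggers autobind
        after = sock.getsockname()[1]         # kernel-assigned port
    return before, after
```

On Linux, `before` is expected to be 0 and `after` a nonzero ephemeral port.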
UDP layer
            |
            |
            ↓
     +-------------+
     | udp_sendmsg |
     +-------------+
            |
            |
            ↓
 +----------------------+
 | ip_route_output_flow |
 +----------------------+
            |
            |
            ↓
     +-------------+
     | ip_make_skb |
     +-------------+
            |
            |
            ↓
+------------------------+
| udp_send_skb(skb, fl4) |
+------------------------+
            |
            |
            ↓
      +----------+
      | IP layer |
      +----------+
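Before the packet is handed to the IP layer, it gets its 8-byte UDP header (source port, destination port, length, checksum, as defined in RFC 768). The sketch below builds that header by hand to show what the fields contain. `build_udp_header` is a hypothetical helper, and the checksum is left at 0, which in IPv4 means "no checksum computed"; the kernel normally fills in a real value.

```python
import struct


def build_udp_header(src_port: int, dst_port: int, payload: bytes,
                     checksum: int = 0) -> bytes:
    """Pack the four 16-bit UDP header fields in network byte order."""
    length = 8 + len(payload)  # header plus data, in bytes
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)
```

For a 4-byte payload sent from port 12345 to port 53, the length field is 12.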
IP layer
          |
          |
          ↓
   +-------------+
   | ip_send_skb |
   +-------------+
          |
          |
          ↓
+-------------------+       +-------------------+       +---------------+
| __ip_local_out_sk |------>| NF_INET_LOCAL_OUT |------>| dst_output_sk |
+-------------------+       +-------------------+       +---------------+
                                                                |
                                                                |
                                                                ↓
+------------------+        +----------------------+       +-----------+
| ip_finish_output |<-------| NF_INET_POST_ROUTING |<------| ip_output |
+------------------+        +----------------------+       +-----------+
          |
          |
          ↓
+-------------------+      +------------------+       +----------------------+
| ip_finish_output2 |----->| dst_neigh_output |------>| neigh_resolve_output |
+-------------------+      +------------------+       +----------------------+
                                                                 |
                                                                 |
                                                                 ↓
                                                         +----------------+
                                                         | dev_queue_xmit |
                                                         +----------------+
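One job of the IP output path that the diagram does not spell out is fragmentation: a packet larger than the outgoing MTU is split before it reaches the device (to my understanding this happens around `ip_finish_output`, which is stated here as an assumption). The arithmetic is sketched below. `fragment_plan` is an invented name, the header is assumed to carry no IP options (20 bytes), and offsets are expressed in 8-byte units as in the IPv4 header.

```python
def fragment_plan(payload_len: int, mtu: int, ip_header_len: int = 20):
    """Return (offset_in_8_byte_units, data_bytes, more_fragments) tuples."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - ip_header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small for any payload")
    plan = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len
        plan.append((offset // 8, size, more))
        offset += size
    return plan
```

With a 3000-byte IP payload and an MTU of 1500, the plan has three fragments of 1480, 1480 and 40 data bytes.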
netdevice subsystem
                           |
                           |
                           ↓
                   +----------------+
  +----------------| dev_queue_xmit |
  |                +----------------+
  |                        |
  |                        |
  |                        ↓
  |               +-----------------+
  |               | Traffic Control |
  |               +-----------------+
  | loopback               |
  |   or                   +---------------------------------------------------------------+
  | IP tunnels             ↓                                                               |
  |                        ↓                                                               |
  |             +---------------------+  Failed   +----------------------+         +---------------+
  +------------>| dev_hard_start_xmit |---------->| raise NET_TX_SOFTIRQ |- - - - >| net_tx_action |
                +---------------------+           +----------------------+         +---------------+
                           |
                           +----------------------------------+
                           |                                  |
                           ↓                                  ↓
                  +----------------+              +------------------------+
                  | ndo_start_xmit |              | packet taps(AF_PACKET) |
                  +----------------+              +------------------------+
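The "Failed" branch in the diagram is a retry loop: when the device cannot accept a packet, the packet stays queued and a `NET_TX_SOFTIRQ` is raised so that `net_tx_action` can try again later. The toy model below only illustrates that control flow. `ToyDevice` and `qdisc_run` are names chosen for this sketch, and a counter of free slots stands in for the real transmit ring.

```python
from collections import deque


class ToyDevice:
    """Toy device with a limited number of free transmit slots."""

    def __init__(self, free_slots):
        self.free_slots = free_slots
        self.sent = []

    def start_xmit(self, packet):
        """Stand-in for the driver transmit hook; False means 'busy'."""
        if self.free_slots == 0:
            return False
        self.free_slots -= 1
        self.sent.append(packet)
        return True


def qdisc_run(queue, dev):
    """Drain the queue; return True if a later retry (softirq) is needed."""
    while queue:
        if not dev.start_xmit(queue[0]):
            return True   # keep the packet at the head and retry later
        queue.popleft()
    return False
```

With three queued packets and two free slots, the first run sends two packets and asks for a retry; after a slot frees up, the second run sends the last one.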