PCI Express as a Killer of Software-based Real-Time Ethernet

Summary

This document investigates the impact of PCI Express on software-based implementations of real-time Ethernet. It focuses on the contribution of PCI Express to transmission jitter in Ethernet frames.

Full Transcript


PCI Express as a Killer of Software-based Real-Time Ethernet
Rostislav Lisový, Michal Sojka, Zdeněk Hanzálek
Czech Technical University in Prague, Faculty of Electrical Engineering
Technická 2, 121 35 Prague 6, Czech Republic
Email: {lisovros,sojkam1,hanzalek}@fel.cvut.cz

Abstract—The time-triggered Ethernet gains in popularity in many different industrial applications. While several hardware implementations exist, software implementations are also very attractive for their price-to-performance ratio. The main parameter that influences the performance of time-triggered protocols is the transmission jitter, which is greater in software implementations. In this paper we evaluate one source of transmission jitter occurring in such software implementations – the PCI Express bus, which interconnects the CPU, memory and the network interface card in modern multi-core computers. We show that the contribution of the PCI Express to the transmission jitter of Ethernet frames is significant and is in the same order of magnitude as the scheduling jitter of modern real-time operating systems. PCI Express latency and jitter are evaluated under various loads produced by virtual machines running on dedicated CPU cores. We use the IEEE 1588 feature of the network card for precise timing measurements.

I. INTRODUCTION

Ethernet-based networks are becoming more and more popular in industrial communication. This is because Ethernet is a historically well proven technology, it offers high bandwidth, and many real-time (RT) extensions exist that make Ethernet communication deterministic. Unfortunately, there is no single universal real-time Ethernet extension. Several companies offer their proprietary solutions. Together with the high price of those solutions, this might be the reason why software-based real-time Ethernet implementations are so popular [2–6].

One way of achieving deterministic medium access is to use the time division multiple access method employed by the so called time-triggered (TT) protocols [7, 8]. TT protocols need to maintain a notion of global time in order to synchronize the transmission in all nodes. One way to achieve this is to use the Precision Time Protocol (PTP), standardized in IEEE 1588, which allows one to synchronize the time with sub-microsecond precision over the Ethernet. The advantages of TT networks are determinism and trivial evaluation of the worst-case behavior. A disadvantage is inefficient use of the available bandwidth, because temporarily unused slots must be either retained in the schedule or complex rules for their skipping must be introduced. In addition, if the used technology exhibits transmission (TX) jitter (the difference between the maximum and the minimum deviation from the intended transmission time), which is common with software-based solutions, it is necessary to insert large inter-frame gaps that decrease bandwidth utilization even more. Examples of TT protocols are TTEthernet, ProfiNet IRT or FlexRay.

Some of the drawbacks of TT protocols are mitigated by event-triggered protocols. There, the medium access is controlled by the reception of specific messages from other nodes. For example, Ethernet Powerlink has a managing node that controls when so called controlled nodes can access the medium. In the Node Ordered Protocol, the medium access is determined by the predefined order of nodes. Another principle is used in Avionics Full-Duplex Switched Ethernet (AFDX), which employs bandwidth limiting to ensure that the network is not overloaded and latencies remain low.

For today's industry, determinism of the network communication is necessary but not sufficient. The efficiency of resource usage is also important, but it contradicts the demand for determinism. Therefore, there are attempts to integrate multiple subsystems of different criticality in a single platform to improve the efficiency. This contrasts with the federated principle applied so far, where every subsystem was implemented as a separate node. Examples of modern integrated architectures are IMA in avionics and AUTOSAR in the automotive domain.
One of the means for efficient integration of subsystems is the use of virtualization. Here, hypervisors are used to provide strict separation of independent subsystems, allowing one to build a mixed-criticality system.

We investigate the possibility of implementing a software-based real-time Ethernet protocol while utilizing the extensive virtualization capabilities of modern x86 hardware. Our focus is on commercial-off-the-shelf (COTS) networking and computing hardware, which is gaining in popularity for industrial automation, not only because of its favorable price and widespread availability but also because of the familiar environment when used with any of the real-time Linux derivatives.

In the past, it was believed that the biggest source of TX jitter occurring in software implementations of real-time Ethernet was the operating system's (OS) scheduler. With appropriate hardware and modern RT operating systems, the worst-case scheduling latencies are below 30 µs. Nowadays, with the advent of multi-core CPUs, it is possible to dedicate one or more cores to network processing and completely eliminate the non-determinism of the OS scheduler. We have done this by using the NOVA microhypervisor. We isolate one CPU from all unrelated activities such as timer interrupts. This is not yet possible with standard Linux and its virtualization platforms such as KVM. Moreover, the minimalist design of NOVA, together with the small memory footprint of our network processing code, ensures that all the network processing code and related data can be fetched from the CPU's private level-2 cache memory without interference caused by memory traffic from other cores. Additionally, the isolation and fault-containment properties of the NOVA system make it suitable for use in safety-critical environments, which would be impossible for systems such as Linux or RTAI, where a huge amount of code (the Linux kernel) runs in the most privileged CPU mode.

We believed that our implementation outlined in the previous paragraph would provide very good performance, especially in terms of TX jitter figures.
To our surprise, the real jitter was greater than 10 µs, which was comparable to other Linux-based solutions found in the literature. Therefore, we decided to investigate the cause of it.

The contributions of this paper are: We evaluate the properties of the PCI Express bus (PCIe), which interconnects the CPU, memory and the network interface card (NIC). We show that the contribution of the PCIe to the TX jitter of Ethernet frames is significant. PCIe latency and jitter are evaluated under various loads produced by virtual machines running on other CPU cores. We use the IEEE 1588 feature of the NIC for precise timing measurements. Our findings are useful for all SW-based real-time protocols implemented on modern x86 hardware. For time-triggered networks, our results can be used to determine the proper size of inter-frame gaps. For event-triggered networks, the TX jitter influences the timing precision, which might be an important parameter for many applications.

The paper is structured as follows: After reviewing the related work in Section II, we describe the architecture of modern computers and of our hardware and software used for the measurements in Section III. The results of our experiments are presented in Section IV and we conclude with Section V.

II. RELATED WORK

Many software implementations of real-time Ethernet exist. Probably the most well known is RTnet. It is a generic networking stack for RTAI and Xenomai – real-time extensions of Linux. As RTnet is implemented as a kernel module sharing an address space with the Linux kernel, it is not well suited for safety-critical applications.

Grillinger, Ademaj, Steinhammer, et al. describe a software implementation of the Time-Triggered Ethernet (TTE) implemented in RTAI. The authors evaluated the achieved latencies and jitters and found them in the order of tenths of microseconds. They claim that the main bottleneck of their implementation is the interrupt latency, which influences the precision of software packet timestamping, and that hardware timestamping would help. In this paper, we show that despite the fact that hardware timestamping is used, the PCI Express causes significant jitter.

Bartols, Steinbach, Korf, et al. analyzed the latencies of the TTE hardware by using a Linux kernel with rt_preempt patches. They implemented software-based timestamping of the packets and report that the precision of their measurements is in units of microseconds. Since the system used for their measurement was based on a PCI Express bus, it is questionable whether the precision was really so low. We show that PCI Express can introduce jitter over 10 µs.

Cena, Cereia, Bertolotti, et al. describe a software implementation of the IEEE 1588 time synchronization protocol based on RTnet. The accuracy of their implementation is assessed by generating a signal on a parallel port of a PC and measuring the properties of that signal. Since the parallel port is connected over a slow LPC bus, as detailed in Section III-A, the jitter of the parallel port's generated signal is also influenced by the PCI Express jitter, which can be quite high.

A pure software implementation of OpenPOWERLINK, an open-source Industrial Ethernet solution, has also been described, and a safety-certifiable software implementation of the AFDX stack and its achieved latencies have been analyzed. None of those papers give sufficient details on the CPU-NIC interconnect.

III. ARCHITECTURE

A. Today's PC architecture

The architecture of the modern PC and of many industrial computers is determined by the architecture of the PCI Express (PCIe) bus. The central component, called a root complex, connects the CPU cores with memory controllers and other peripherals (Fig. 1). It is usually integrated on the same chip as the CPU. The other components of the PCIe topology are endpoints, which usually represent devices, and switches, which are responsible for interconnecting all of the endpoints with the root complex. All those components are interconnected via PCIe links that are formed by one or more lanes. The more lanes, the higher the bandwidth of the link. An N-lane link is denoted as xN. All PCIe communication is packet-based and packets are routed from the root complex via switches to the destination endpoints and back. Since one link can transfer only one packet in one direction at a time, packets may be delayed by waiting for a free link.

A root complex typically has several PCIe links. In PCs, one is dedicated to a graphics adapter, another is connected to a so called platform controller hub (PCH, or chipset in short). It contains PCIe switch(es) interconnecting different PCIe endpoints and conceivably a bridge to the legacy PCI bus. The PCH also integrates other controllers such as USB, SATA and LPC (low pin count interface – used to connect legacy peripherals such as a parallel port or a PS/2 keyboard). Those additional controllers appear to the operating system as PCI devices.
ware implementation of the Time-Triggered Ethernet (TTE) Besides PCI devices, PCH also contains non-PCI devices such implemented in RTAI. The authors evaluated the achieved as high-precision event timers (HPET). latencies and jitters and found them in the order of tenths of microseconds. They claim that the main bottleneck of their Due to the packet-based character of the PCIe communica- implementation is the interrupt latency that influences the tion, sharing of PCI links between devices and several sources precision of software packet timestamping and that hardware of latency in the PCIe communication protocol (e.g. the time stamping would help. In this paper, we show that despite need for acknowledging received packets), the total latency of the fact that hardware timestamping is used, the PCI Express PCIe communication can be relatively high compared to an causes significant jitter. older parallel PCI bus. Bartols, Steinbach, Korf, et al. analyzed the latencies B. Intel 82576 Gigabit Ethernet Controller of the TTE hardware by using a Linux kernel with rt preempt patches. They implemented software-based timestamping of In our experiments, presented in Section IV, we used a the packets and report that the precision of their measurements modern network interface card (NIC) based on Intel’s 82576 is in units of microseconds. Since the system used for their Gigabit Ethernet controller. The main reason we chose this CPU VM 4 PCIe Graphics VMM Core 0 Core 1 Core 2 Core 3 Adapter Slot User mode (GFX) RTEth VM 1 VM 2 VM 3 + NIC PCIe VMM VMM VMM driver x16 Root Complex RAM DDR3 Kernel mode NOVA microhypervisor Hardware CPU 0 CPU 1 CPU 2 CPU 3 DMI / PCIe PCH Figure 2. Software architecture of our implementation HPET PCIe typically x4 PCIe PCIe Slots Switch SATA mode (kernel mode). Its very small trusted computing base (9 (IO) PCIe kLoC) together with the virtualization support makes it a very USB Switch interesting solution for safety-critical applications. Note that Serial / Parallel in NOVA, device drivers are not the part of the kernel. PCIe to PCI PCI port; Keyboard, LPC Bridge Slots NOVA can execute applications in two modes: native mode Mouse PS/2 and virtual machines (VM). Virtual machine monitor (VMM) Figure 1. Typical architecture of a modern PC is a NOVA application running in the user mode comprising the native part that emulates the PC platform and the VM part that executes the VM code. COTS NIC was the built-in hardware support for the IEEE The code implementing the real-time Ethernet functionality 1588 standard. This support was used for precise measure- is placed in RTEth application running in the user mode. It ments of the PCIe latencies in this paper. The NIC contains is a native NOVA application and besides other things, it two Ethernet controllers but in our experiments we use only contains the device driver for the NIC. The application is one of them. responsible for managing the transmission and reception of The key features supporting the implementation of PTP on the Ethernet frames according to the predefined schedule. It this device are an adjustable clock and hardware timestamping. is important to note, that the application does not touch (i.e. copy) the data to be transmitted or received. The data to be The adjustable clock is implemented as a 64-bit up counter. transmitted is stored in the main system memory directly by The readout is possible through two 32-bit registers (the higher the application that produces it (e.g. a virtual machine). 
C. Software architecture

Our longer-term goal is to build a software-based time-triggered Ethernet stack on COTS x86 computers with the NOVA microhypervisor. While this stack is not yet implemented, we outline its planned software architecture in this section, because it is the same as in our experimental setup for this paper.

The software architecture is depicted in Figure 2. The lowest level consists of the NOVA microhypervisor. It is responsible for hardware resource management and scheduling and it is the only component that runs in privileged processor mode (kernel mode). Its very small trusted computing base (9 kLoC), together with the virtualization support, makes it a very interesting solution for safety-critical applications. Note that in NOVA, device drivers are not part of the kernel.

Figure 2. Software architecture of our implementation (the NOVA microhypervisor runs in kernel mode on a 4-core CPU; in user mode, the RTEth application with the NIC driver is pinned to its own core while the virtual machines and their VMMs run on the remaining cores).

NOVA can execute applications in two modes: native mode and virtual machines (VM). A virtual machine monitor (VMM) is a NOVA application running in user mode, comprising the native part that emulates the PC platform and the VM part that executes the VM code.

The code implementing the real-time Ethernet functionality is placed in the RTEth application running in user mode. It is a native NOVA application and, besides other things, it contains the device driver for the NIC. The application is responsible for managing the transmission and reception of the Ethernet frames according to the predefined schedule. It is important to note that the application does not touch (i.e. copy) the data to be transmitted or received. The data to be transmitted is stored in the main system memory directly by the application that produces it (e.g. a virtual machine). This application only notifies the RTEth application that the data is ready in the memory and RTEth instructs the NIC to transmit it. A similar principle is applied for packet reception. This is possible because of the use of shared memory between the RTEth application and its clients. The implementation is simplified by the use of the IOMMU.

The RTEth application itself is pinned to one CPU core, which is reserved solely for it. The size of the application's code and data is 40 KiB, which means it fits into the CPU's 256 KiB of L2 cache. Note that the kernel code used for inter-process communication and scheduling is less than 2 KiB in size and it fits into the cache together with the application. This means that the application does not suffer from interference caused by memory traffic from other cores in the system. This reduces the execution time jitter of the application and makes its execution more predictable.
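The zero-copy hand-off between a producer VM and RTEth described above is not published as code, so the following is only a minimal sketch of one way it could look, assuming a single-producer descriptor ring in shared memory; all names (rteth_desc, rteth_ring, rteth_submit) are invented for illustration:

#include <stdatomic.h>
#include <stdint.h>

/* One frame descriptor in the memory shared between a producer
 * (client VM) and the RTEth application.  The payload stays where
 * the producer wrote it; only its address and length are passed on. */
struct rteth_desc {
    uint64_t paddr;   /* physical address of the frame payload */
    uint32_t len;     /* payload length in bytes */
    uint32_t slot;    /* time-triggered slot this frame belongs to */
};

#define RING_SIZE 64u               /* power of two */

struct rteth_ring {
    struct rteth_desc desc[RING_SIZE];
    _Atomic uint32_t head;          /* advanced by the producer */
    _Atomic uint32_t tail;          /* advanced by RTEth after the NIC took the frame */
};

/* Producer side: publish a frame that already lives in shared memory.
 * Returns 0 on success, -1 if the ring is full.  RTEth polls head from
 * its dedicated core and hands paddr/len to the NIC without copying. */
int rteth_submit(struct rteth_ring *r, uint64_t paddr, uint32_t len, uint32_t slot)
{
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (head - tail == RING_SIZE)
        return -1;                  /* no free descriptor */

    struct rteth_desc *d = &r->desc[head % RING_SIZE];
    d->paddr = paddr;
    d->len   = len;
    d->slot  = slot;

    /* Publish the descriptor before the new head becomes visible. */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return 0;
}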
D. Testbed setup

The computer used for the evaluation was a common PC equipped with an Intel i5-3550 CPU (IvyBridge, 4 cores, no hyper-threading), 4 GiB of RAM and a network add-on card with an Intel 82576 GbE controller (the NIC in the following). The NIC is equipped with an x4 PCIe connector.

The computer comprises two PCIe slots. One of them is an x16 slot connected directly to the root complex inside the CPU; the other, an x4 slot, is connected to the chipset (PCH). For the purpose of this paper we call the former the GFX slot and the latter the IO slot.

Besides experimenting with the NIC, we also utilized the SATA controller to generate interfering PCIe traffic. We connected a common rotating hard drive with a SATA 3.0 interface to one of the on-board SATA ports.

We tried to extract the physical PCIe topology of our system, but it does not provide the relevant PCIe capabilities to do that. The logical PCI topology presented to the operating system is flattened and does not correspond to the physical topology. Nevertheless, Figure 3 shows the output of the lspci tool. In this case, the NIC was connected to the GFX slot, which is denoted as PCI bus number 1. When the NIC was connected to the IO slot, the corresponding entries appeared on bus 2.

Figure 3. Logical PCIe topology as shown by the lspci -tv command:
  +-00.0  Intel Corp. Ivy Bridge DRAM Controller
  +-01.0-+-00.0  Intel Corp. 82576 Gigabit Network Connection
  |      \-00.1  Intel Corp. 82576 Gigabit Network Connection
  +-02.0  Intel Corp. Ivy Bridge Graphics Controller
  +-14.0  Intel Corp. Panther Point USB xHCI Host Controller
  +-16.0  Intel Corp. Panther Point MEI Controller #1
  +-19.0  Intel Corp. 82579LM Gigabit Network Connection
  +-1a.0  Intel Corp. Panther Point USB Enhanced Host Controller #2
  +-1b.0  Intel Corp. Panther Point High Definition Audio Controller
  +-1d.0  Intel Corp. Panther Point USB Enhanced Host Controller #1
  +-1e.0---
  +-1f.0  Intel Corp. Panther Point LPC Controller
  +-1f.2  Intel Corp. Panther Point 6 port SATA AHCI Controller
  \-1f.3  Intel Corp. Panther Point SMBus Controller

In our measurements, we did not identify any anomalies that could be caused by System Management Interrupts, so we did not attempt to eliminate or mitigate them.

IV. EVALUATION

This section presents the results of our measurements of the PCI Express latencies. We measured two types of latencies in our experiments: the latency of the NIC clock register readout and the hardware TX latency. The experiments are described in more detail in the following subsections.

We present the results of some experiments in the form of a latency profile. This is a cumulative histogram of the measured values with a reversed vertical axis in log scale. The advantage over a plain cumulative histogram is that the worst-case latencies are magnified by the log scale (see for instance the lower right hand corner of Fig. 4).
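A latency profile is straightforward to compute from raw samples. The helper below is a hypothetical sketch, not the paper's tooling: it sorts the measured latencies and, for each value, emits the number of samples at least that large; plotted with a logarithmic vertical axis this gives exactly the reversed cumulative histogram used in the figures.

#include <stdio.h>
#include <stdlib.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Print a latency profile: for each measured latency, the number of
 * samples that are greater than or equal to it.  Plotting the second
 * column against the first with a log-scaled vertical axis magnifies
 * the worst-case tail, which is the point of the representation. */
void print_latency_profile(double *lat_us, size_t n)
{
    qsort(lat_us, n, sizeof(*lat_us), cmp_double);
    for (size_t i = 0; i < n; i++)
        printf("%.3f %zu\n", lat_us[i], n - i);   /* latency, count >= latency */
}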
All measurements were performed under several different loads of the system:

No load: No intentional load was placed on the PCIe or CPU.

CPU load: Three Linux 3.6 VMs running on dedicated CPUs (VM1-3 in Fig. 2) run a Sysbench CPU benchmark (available from http://sysbench.sourceforge.net/; the command was sysbench --test=cpu --cpu-max-prime=999999999 run --num-threads=1).

Disk load: One Linux 3.6 VM with direct access to the SATA controller (connected to the PCIe bus) was run on a dedicated CPU. This VM was executing a dd if=/dev/sda of=/dev/null bs=8M count=1 iflag=direct command in an infinite loop. The size of the block (8 MB) was chosen on purpose to fit into the on-disk cache. This way, the disk traffic consumes the maximum possible PCIe bandwidth.

Combined load: A combination of disk and CPU load – one VM with disk load and two VMs with CPU load.

Disk + serial load: The same as disk load but the output of the dd command (about 100 characters) was sent to the serial line.

Besides changing the load, we also changed the PCIe slot where the NIC was plugged in during the experiment (the GFX and the IO PCIe slot).

A. Latency of NIC clock register readout

In this experiment, we measured the time spent by reading the clock register located in the NIC. The resulting latency was calculated as the difference between the values obtained from two consecutive reads of the whole 64-bit clock register (tclk2 − tclk1 in Fig. 6). Despite the fact that we do not exactly know how big a fraction of the total time was spent in the NIC's internal logic and how much was caused by the PCIe latencies, we believe that the measured time represents the lower bound of the communication latencies between the CPU and the NIC.

Figure 6. Explanation of the measured latencies (timeline of the CPU, RAM and NIC: two NIC clock readouts over PCIe give tclk1 and tclk2, the NIC then fetches and processes the frame, and tTX marks the hardware transmission timestamp).

Figure 4. Latency profile of the NIC clock register readout (for the NIC in two different PCIe slots). System with no load.
Figure 5. Latency profile of the NIC clock register readout (for the NIC in two different PCIe slots). System with CPU load.

Figure 4 shows the values measured for the no load scenario whereas Figure 5 contains the values measured with CPU load. It can be seen that there are significant differences in the latency and jitter figures between the PCIe slots. We summarize the measured values in the table below:

                Avg. latency            Jitter
  Slot      No load   CPU load     No load   CPU load
  GFX       1.38 µs   1.41 µs      5.31 µs   5.76 µs
  IO        3.11 µs   3.12 µs      1.87 µs   2.21 µs

Figure 7. NIC clock readout latency in the disk + serial load scenario (NIC was in the IO slot).

Figure 7 shows the results of the disk + serial load scenario for the NIC in the IO slot. There are periodic spikes of increased latency. Although we first thought that the spikes were caused by disk transfers, it turned out that they are brought about by communication over the serial port that we used as a console for the VM. In NOVA, the serial driver uses polling to wait for an empty TX buffer register and this results in a high PCIe bus load. In production systems, polling is avoided whenever possible, but sometimes device drivers have to use polling to work around hardware bugs.

A careful reader can also identify a small increase in latencies (cca. 0.5 µs) with a 40 ms period in Figure 7. This was caused by updating the text screen in the VGA video RAM of the integrated graphics adapter. If the VGA is configured in NOVA, the screen of the foreground application gets copied to the VGA memory 25 times per second. A similar increase of latencies can also be seen in Fig. 4. If the external graphics adapter and/or a fully graphical mode was used, the latencies could be much worse.

It is interesting to see that even a sole CPU load on unrelated CPUs caused big increases in latencies. The reason is that the CPU load causes the Linux scheduler to be invoked frequently. This resulted in about 1500 timer interrupts per second per VM. As NOVA uses HPET timers as a backend for all virtual timers, the communication between the CPU and the HPET, located in the chipset, has an influence on the PCIe bus and, therefore, also on the NIC latencies.

The worst latencies were achieved for the disk + serial load, although the sole disk load exhibits a low average latency. As mentioned above, this is caused by polling in the serial port driver. In summary, the jitter of the PCIe latency is similar for both slots and is approximately 10 µs.
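The measurement loop described at the beginning of this subsection can be sketched as follows, assuming the nic_clock_read() helper from Section III-B and a clock increment configured so that one tick corresponds to one nanosecond; both are assumptions made for illustration, not the authors' measurement code:

#include <stdint.h>
#include <stdio.h>

extern uint64_t nic_clock_read(void);   /* from the earlier register sketch */

/* Measure the NIC clock readout latency: two back-to-back reads of the
 * 64-bit clock register; their difference (t_clk2 - t_clk1) is taken as
 * a lower bound on one CPU-to-NIC round trip over PCIe. */
void measure_readout_latency(unsigned samples)
{
    for (unsigned i = 0; i < samples; i++) {
        uint64_t t_clk1 = nic_clock_read();
        uint64_t t_clk2 = nic_clock_read();
        /* With a 1 ns tick the difference is directly in nanoseconds. */
        printf("%llu\n", (unsigned long long)(t_clk2 - t_clk1));
    }
}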
B. Hardware NIC TX latency

In this experiment, we measured the time needed by the NIC to start the transmission of a frame. More precisely, we measured the time between the moment when the NIC got the information about a new frame to send (setting the NIC TX descriptor register to point to the ready packet descriptor) and the timestamp captured by the NIC while the frame was being transmitted. During the transmission, the NIC autonomously fetches the frame payload from the RAM (via PCIe).

The results presented in this section are valid for 166 byte long frames. When we increased the frame length to 1 KiB, the latency increased by 1.5 µs in both the GFX and IO slots.

Figure 8. Latency profiles of the frame transmission on an Intel 82576 GbE controller for different PCIe and system loads (NIC plugged into the GFX PCIe slot); the compared scenarios are no load, CPU load, disk load, combined load and disk + serial load.
Figure 9. Latency profiles of the frame transmission on an Intel 82576 GbE controller for different PCIe and system loads (NIC plugged into the IO PCIe slot).

Figures 8 and 9 show the latency profiles of the TX latencies under different loads in the GFX and IO slots, respectively. The latencies were calculated as the difference between the TX timestamp and the value read from the NIC clock register just before setting the NIC TX descriptor register (tTX − tclk2 in Fig. 6). The latencies for the GFX slot and the IO slot ranged from 5 µs to 14 µs and from 8.5 µs to 19.5 µs, respectively.

Figure 10. No load and disk load scenarios. Latencies calculated as tTX − tclk2.

The distribution of latencies in time is shown in Figure 10. In the depicted experiment, the periods of the no load and disk load scenarios were interleaved with a period approximately equal to 60 seconds. It can be seen that the distribution of the increased latencies in time is uniform.

The precision of our measurement method is influenced by the following factors: The end of the measured interval is captured with very high precision (hardware timestamp), but the start of the interval (the NIC clock readout) suffers from an error in the range of several microseconds, as shown in Section IV-A. If we wanted to decrease the error, we would need another clock synchronised with the NIC clock that has a negligible readout time.
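The TX latency measurement can be outlined in the same style. The helpers nic_post_tx_descriptor() and nic_read_tx_timestamp() are hypothetical stand-ins for writing the TX descriptor tail register and polling the TX timestamp registers; neither is the paper's actual driver code:

#include <stdint.h>

extern uint64_t nic_clock_read(void);                 /* Section III-B sketch */
extern void     nic_post_tx_descriptor(uint64_t paddr, uint32_t len);
extern int      nic_read_tx_timestamp(uint64_t *ts);  /* returns 0 once a timestamp is ready */

/* One sample of the hardware TX latency (t_TX - t_clk2): read the NIC
 * clock, immediately tell the NIC about a ready frame, then wait for the
 * hardware timestamp taken right after the start-of-frame delimiter. */
uint64_t measure_tx_latency(uint64_t frame_paddr, uint32_t frame_len)
{
    uint64_t t_clk2 = nic_clock_read();          /* just before the doorbell */
    nic_post_tx_descriptor(frame_paddr, frame_len);

    uint64_t t_tx;
    while (nic_read_tx_timestamp(&t_tx) != 0)
        ;                                        /* busy-wait for the TX timestamp */

    return t_tx - t_clk2;                        /* in NIC clock ticks */
}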
V. CONCLUSION

With hardware support for IEEE 1588, it is possible to synchronize NIC clocks with sub-microsecond precision. However, if one wants to schedule the Ethernet traffic in software, as many popular real-time Ethernet software stacks do, the achieved frame transmission precision is much worse. It is believed that the main cause of the transmission jitter is the scheduler of the operating system. In this paper, we identified another often neglected source of the transmission jitter, which is the PCI Express bus. We measured its contribution to the overall transmission jitter and found it to be around 10 µs. This value is in the same order of magnitude as the scheduling jitter of modern real-time operating systems.

As for our future work, we will look at improving the PCI Express induced jitter by using PCI Express QoS features such as isochronous virtual channels. We could not use them in this work, because our NIC does not support them. We plan to use the NetFPGA platform to experiment with those.

ACKNOWLEDGMENT

The authors would like to thank Zoltan Fodor from Intel for providing the network cards for the experiments. The research leading to these results received funding from the ARTEMIS Joint Undertaking under the grant agreement no 295354 and from the Grant Agency of the Czech Republic under the Project GACR P103/12/1994.

REFERENCES

M. Felser, "Real time ethernet: standardization and implementations", in Industrial Electronics (ISIE), 2010 IEEE International Symposium on, 2010, pp. 3766–3771. DOI: 10.1109/ISIE.2010.5637988.

J. Kiszka, B. Wagner, Y. Zhang, and J. Broenink, "RTnet – a flexible hard real-time networking framework", in Emerging Technologies and Factory Automation (ETFA 2005), 10th IEEE Conference on, Catania, Italy, Sep. 2005. DOI: 10.1109/ETFA.2005.1612559.

P. Grillinger, A. Ademaj, K. Steinhammer, and H. Kopetz, "Software implementation of a time-triggered ethernet controller", in Factory Communication Systems, 2006 IEEE International Workshop on, IEEE, pp. 145–150. DOI: 10.1109/WFCS.2006.1704143.

I. Khazali, M. Boulais, and P. Cole, "AFDX software network stack implementation—practical lessons learned", in Digital Avionics Systems Conference (DASC '09), IEEE/AIAA 28th, IEEE, 2009. DOI: 10.1109/DASC.2009.5347574.

J. Baumgartner and S. Schoenegger, "POWERLINK and real-time Linux: A perfect match for highest performance in real applications", in Twelfth Real-Time Linux Workshop, Nairobi, Kenya, 2010.

F. Bartols, T. Steinbach, F. Korf, and T. C. Schmidt, "Performance analysis of time-triggered ether-networks using off-the-shelf-components", in Object/Component/Service-Oriented Real-Time Distributed Computing Workshops (ISORCW), 2011 14th IEEE International Symposium on, IEEE, 2011, pp. 49–56. DOI: 10.1109/ISORCW.2011.16.

G. Cena, M. Cereia, I. Bertolotti, S. Scanzio, A. Valenzano, and C. Zunino, "A software implementation of IEEE 1588 on RTAI/RTnet platforms", in Emerging Technologies and Factory Automation (ETFA), 2010 IEEE Conference on, 2010, pp. 1–8. DOI: 10.1109/ETFA.2010.5640955.

H. Kopetz, A. Ademaj, P. Grillinger, and K. Steinhammer, "The time-triggered ethernet (TTE) design", in Object-Oriented Real-Time Distributed Computing (ISORC 2005), Eighth IEEE International Symposium on, IEEE, 2005, pp. 22–33. DOI: 10.1109/ISORC.2005.56.

A. Ademaj and H. Kopetz, "Time-triggered ethernet and IEEE 1588 clock synchronization", in Precision Clock Synchronization for Measurement, Control and Communication (ISPCS 2007), IEEE International Symposium on, IEEE, 2007, pp. 41–43. DOI: 10.1109/ISPCS.2007.4383771.

J. C. Eidson, Measurement, Control, and Communication Using IEEE 1588, 1st ed. Springer Publishing Company, Incorporated, 2010, ISBN: 9781849965651.

Z. Hanzalek, P. Burget, and P. Sucha, "Profinet IO IRT message scheduling with temporal constraints", IEEE Transactions on Industrial Informatics, vol. 6, no. 3, pp. 369–380, 2010, ISSN: 1551-3203. DOI: 10.1109/TII.2010.2052819.

FlexRay Consortium et al., "FlexRay communications system", Protocol Specification Version, vol. 2, 2005.

L. Chanjuan, N. McGuire, and Z. Qingguo, "A new real-time network protocol – node order protocol", in Proceedings of the Eleventh Real-Time Linux Workshop, Open-Source Automation Development Labs, 2009, pp. 105–109.

C. Watkins and R. Walter, "Transitioning from federated avionics architectures to Integrated Modular Avionics", in Digital Avionics Systems Conference (DASC '07), IEEE/AIAA 26th, 2007. DOI: 10.1109/DASC.2007.4391842.

AUTOSAR, Specification of Operating System, R4.0 rev 3, Nov. 2011. [Online]. Available: http://www.autosar.org/download/R4.0/AUTOSAR_SWS_OS.pdf.

C. Baumann, T. Bormer, H. Blasum, and S. Tverdyshev, "Proving memory separation in a microkernel by code level verification", in Object/Component/Service-Oriented Real-Time Distributed Computing Workshops (ISORCW), 2011 14th IEEE International Symposium on, 2011, pp. 25–32. DOI: 10.1109/ISORCW.2011.14.

C. Emde, "Long-term monitoring of apparent latency in preempt_rt Linux real-time systems", OSADL, Oct. 2010. [Online]. Available: https://www.osadl.org/fileadmin/dam/articles/Long-term-latency-monitoring.pdf (visited on 04/2013).

U. Steinberg and B. Kauer, "NOVA: a microhypervisor-based secure virtualization architecture", in Proceedings of the 5th European Conference on Computer Systems (EuroSys '10), Paris, France: ACM, 2010, pp. 209–222, ISBN: 978-1-60558-577-2. DOI: 10.1145/1755913.1755935.

J. Kiszka, RTnet, Apr. 3, 2013. [Online]. Available: http://www.rtnet.org/.

J. Loeser and H. Haertig, "Low-latency hard real-time communication over switched Ethernet", in Real-Time Systems (ECRTS 2004), Proceedings of the 16th Euromicro Conference on, 2004, pp. 13–22. DOI: 10.1109/EMRTS.2004.1310992.

PCI Special Interest Group, PCI Express Base Specification, Revision 2.1. PCI-SIG, 2009.

Intel, Intel 7 Series / C216 Chipset Family Platform Controller Hub (PCH) Datasheet, 326776-003, Jun. 2012. [Online]. Available: http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/7-series-chipset-pch-datasheet.pdf (visited on 04/2013).

K. Yogendhar, V. Thyagarajan, and S. Swaminathan, "Realizing the performance potential of a PCI-Express IP", May 21, 2007. [Online]. Available: http://www.design-reuse.com/articles/15900/realizing-the-performance-potential-of-a-pci-express-ip.html.

Freescale Semiconductor, MPC5200 (L25R) Errata, Rev. 5, ch. 2.1, "ATA interrupt is not affected by FIFO errors", Dec. 2011. [Online]. Available: http://www.freescale.com/files/32bit/doc/errata/MPC5200E.pdf (visited on 04/2013).

M. Cereia, I. Bertolotti, and S. Scanzio, "Performance of a real-time EtherCAT master under Linux", IEEE Transactions on Industrial Informatics, vol. 7, no. 4, pp. 679–687, 2011, ISSN: 1551-3203. DOI: 10.1109/TII.2011.2166777.

NetFPGA, 2013. [Online]. Available: http://netfpga.org/ (visited on 04/2013).
