Embedded Systems and Security
Introduction
Four requirements for embedded systems
- Efficiency, function, dependability, security
Relationship between Dependability and security
Difference between security and safety
Goal of lecture
- Being able to design secure embedded systems
- Assess and choose appropriate measures to secure an embedded system
- Implement given tasks on an embedded system (done)
- Use toolchains for cross-platform development
- Discuss memory organization
- Classify different types of on-chip memory
- Recall the boot process of a uC
- Describe and use memory mapped I/O
- Compare and use methods for embedded debugging
- Explain and use interrupts
- List common peripherals and explain their purpose
Microcontroller Basics
Components of an embedded system
Describe every component in a few words. (What is it? What is it used for?)
CPU
Pipelined processor
Popular architectures
- ARM Cortex-M
- Intel x86
Symmetric or asymmetric multicore
Cache
Fast memory to increase performance and decrease power
Either separate caches for instructions and data (I-Cache and D-Cache) or one unified cache
Critical impact on predictability and real-time behavior
Scratch Pad Memory
small and fast local memory
Usable by user and compiler
Interrupt Unit
For fast reaction on asynchronous events
Could be a simple register which triggers one interrupt handler
Can be a sophisticated hardware machine which prioritizes different parallel interrupts and injects a jump to an interrupt handler into the instruction path of the processor
Debug
- Interface to an external debugger in order to start and stop the program and to read as well as to write memory locations
- External connection usually through a serial interface (JTAG, UART, USB)
- Sometimes trace capability is available to observe internal processor states
Bus
High Bandwidth Bus
- Bus or switch matrix to connect on-chip memories, fast IOs, DMA, and caches
- 32 or 64 bit wide, pipelined, bursts, high throughput, reasonable latency
Slow Bus
Low performance bus to connect configuration registers and slow peripherals, e.g. timers and serial IO
16 or 32 bit, sometimes multiplexed address and data
Low latency and sometimes special hardware to allow bit or byte writes and hardware locks
Bus Bridge
Connection of different buses
Load reduction to increase bus speed
Separation of low and high bandwidth buses
Memory
- RAM
on-chip volatile memory; contents are lost after power down
- ROM
non-volatile memory; mask-programmed; boot code
- Flash, EEPROM, FRAM
non-volatile memory but erasable
MMU/MPU
MPU: Management of access rights for defined memory regions
MMU: mapping between virtual and physical addresses
DMA(direct memory access)
Bus master for transfer of data from peripherals to memory
Decreases CPU load, but may cause bus access contention
In DMA, we use a separate single-purpose processor, called a DMA controller, whose sole purpose is to transfer data between memories and peripherals. Briefly, the peripheral requests servicing from the DMA controller, which then requests control of the system bus from the microprocessor. The microprocessor merely needs to relinquish control of the bus to the DMA controller.
Fast IO
PCIe, Ethernet, high-speed USB
External memory interfaces: DDRx, SD, MMC
Usually large send and receive buffers
Serviced by DMA
Slow IO
- UART, I²C (IIC), SPI, full-speed USB, CAN
- Small buffers usually handled through byte accesses
- May be implemented in software with GPIOs as well
I/O addressing (ADD)
parallel I/O
Port-based I/O is also called parallel I/O.
A port is a set of pins that can be read and written just like any register in the microprocessor. A C-language programmer may write to P0 using a statement like: P0 = 255
A microprocessor may use one of two methods for communication over a system bus: standard I/O or memory-mapped I/O
standard I/O
The bus includes an additional pin, which we label M/IO, to indicate whether the access is to memory or to a peripheral. For example, when M/IO is 0, the address on the address bus corresponds to a memory address. When M/IO is 1, the address corresponds to a peripheral
- advantage is no loss of memory addresses
memory-mapped I/O
Peripherals occupy specific addresses in the existing address space. For example, consider a bus with a 16-bit address. The lower 32K addresses may correspond to memory addresses, while the upper 32K may correspond to I/O addresses.
- advantage is that the microprocessor need not include special instructions for communicating with peripherals.
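A minimal C sketch of memory-mapped I/O. On a real device the register pointer would be a fixed address taken from the device header; here an ordinary variable can stand in for the register so the code also runs on a PC. The helper names and the address in the comment are assumptions made for illustration.

```c
#include <stdint.h>

/* A peripheral register is accessed through a volatile pointer so the
 * compiler performs every read and write exactly as written. */
typedef volatile uint32_t reg32_t;

/* Write a whole register, comparable to `P0 = 255` for a port. */
static inline void reg_write(reg32_t *reg, uint32_t value)
{
    *reg = value;
}

/* Set individual bits with a read-modify-write sequence. */
static inline void reg_set_bits(reg32_t *reg, uint32_t mask)
{
    *reg |= mask;
}

/* Clear individual bits with a read-modify-write sequence. */
static inline void reg_clear_bits(reg32_t *reg, uint32_t mask)
{
    *reg &= ~mask;
}

/* On a target, a register would be declared from its address, e.g.
 *   #define PORT0_OUT ((reg32_t *)0x48028000u)   // hypothetical address
 * and used as reg_write(PORT0_OUT, 255u). Only ordinary loads and
 * stores are generated; no special I/O instruction is involved. */
```

Because the helpers take a pointer, the same code can be exercised on a host by passing the address of a plain variable.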
others
Accelerators
Special-purpose hardware for dedicated tasks, e.g.
Crypto processors for AES
Long number calculation units for asymmetric crypto
AD/DA
Analog-to-digital and digital-to-analog converters to interface with the physical world
Timer
All kinds of timers for time counting and measurement as well as pulse width modulation
Control
Control unit for start-up, power management, clock control, configuration management
MMI
Man Machine Interface, Interface to displays, keyboard
HSM: Hardware Security Module
- Special hardware acting as hardware trust anchor
- Independent CPU
- Crypto accelerators
- Secret key storage
- Protecting boot sequence
- Controlling access
Two Memory Architectures
Describe von Neumann and Harvard architecture characteristics and difference.
von Neumann: Fewer memory wires
Harvard: Simultaneous program and data memory access
Registers in Cortex-M
Instructions of a microcontroller
Data processing
- ADD, SUB, MUL, DIV, LSL, AND, OR, …
Data Move
- LDR, STR, LDM, STM
- PUSH, POP
- MRS, MSR
Control
- B, B<cond> (conditional branch), CBZ
- BL, BLX
Format of Instruction
opcode: which operation should be executed?
operand1: which register is affected?
operand2: an immediate value, a register, an immediate-shifted register, or a register-shifted register
operand3: optional register value
16-bit vs 32-bit instructions
32 bit:
- Larger code size
- Larger operands
- More instructions
16 bit:
- Small code size
- Small operands
- Limited number of instructions
Mixture:
- Optimized code size and large operands
- Bits in opcode needed to distinguish 16 and 32 bit instructions
ARM specific solution for mixed length instructions
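How a decoder can tell the two lengths apart may be sketched in C. In the Thumb-2 encoding, a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction; any other halfword is a complete 16-bit instruction. The function names are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if `halfword` is the first half of a 32-bit Thumb-2
 * instruction, false if it is a complete 16-bit instruction. */
static bool thumb_is_32bit(uint16_t halfword)
{
    uint16_t top5 = (uint16_t)(halfword >> 11);   /* bits [15:11] */
    return top5 == 0x1Du || top5 == 0x1Eu || top5 == 0x1Fu;
}

/* Length in bytes of the instruction that starts with `halfword`. */
static unsigned thumb_length(uint16_t halfword)
{
    return thumb_is_32bit(halfword) ? 4u : 2u;
}
```

For example, 0x4770 (BX LR) decodes as a 16-bit instruction, while 0xF000 (first half of a BL) announces a 32-bit instruction.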
Addressing mode
- In immediate addressing, the operand field contains the data itself.
MOV Rn, #immed
- In register addressing, the operand field contains the address of a datapath register in which the data resides
- In register-indirect addressing, the operand field contains the address of a register, which in turn contains the address of a memory location in which the data resides
- In direct addressing, the operand field contains the address of a memory location in which the data resides.
- In indirect addressing, the operand field contains the address of a memory location, which in turn contains the address of a memory location in which the data resides.
- In inherent or implicit addressing, the particular register or memory location of the data is implicit in the opcode; for example, the data may reside in a register called the "accumulator."
Stack and Heap
stack
- Stack stores all local variables of a function
- If a subroutine is called, the current state of the CPU has to be stored on the stack as well
Heap
- The heap is a memory area which can be used to store
- Static data
- Dynamically allocated memory
- Global variables
- Usually the heap grows upwards
- Memory is not maintained automatically
- Dynamically allocated space has to be freed
- Access to memory has to be synchronized
ARM usage of registers in C/C++ procedure calls
Build an executable in a microcontroller
ELF (Executable and Linkable Format)
- The linker reads it as an input which can be linked with other objects
- The loader interprets it as an executable program
Important ELF sections
.text
Program code
.data
initialized data larger than 64 KB
.bss
uninitialized data larger than 64 KB
.sdata
initialized data smaller than 64 KB
.sbss
uninitialized data smaller than 64 KB
.rodata
non-volatile default initialization parameters and data
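As a rough illustration, this is how C objects usually end up in these sections with a GCC-style toolchain. The exact placement depends on compiler, options, and linker script, so the comments describe typical outcomes rather than guarantees; all names are invented.

```c
#include <stdint.h>

uint32_t boot_count = 3;            /* initialized global -> .data (or .sdata) */
uint32_t rx_errors;                 /* no initializer     -> .bss  (or .sbss)  */
const uint32_t baud_rate = 115200;  /* read-only constant -> .rodata           */

/* The machine code of a function goes to .text */
uint32_t next_boot_count(void)
{
    static uint32_t calls;          /* static local, no initializer -> .bss */
    calls++;
    return boot_count + calls;
}
```

Running `objdump -h` or `nm` on the resulting object file is one way to check where a given toolchain actually placed each symbol.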
Main difference between embedded and PC
PC
- Programs are loaded through an operating system (OS) into the RAM of the PC
- Libraries are mostly dynamically linked
- The operating system uses an MMU to translate the virtual addresses used in a program into physical addresses of RAM
Embedded
- Programs reside in on-chip ROM/Flash memory and are directly executed from there
- Programs are statically linked sometimes including a real time OS and different tasks
Debugging
Types of bug
- Bohrbug: Deterministic, easy to find
- Heisenbug: Not deterministic, hard to reproduce, measurement changes behavior
- Schroedinbug: Invisible until detected
- Mandelbug: Looks like a Bohrbug, but fixing it causes a chain of new bugs that may never terminate
How to avoid debugging
- Code review
- Static code analysis
- Compiler messages
- Agile SW development
- Test driven design
Debugging methods
- Code instrumentation, assertions, printf
- simulation
- emulation
- software debugger
- hardware debugger
Hardware debugger architecture
Features characterizing debuggers
- Physical Tool Interface
- Halt after Reset
- Single Stepping
- Breakpoints
- Watchpoints (Data Address + Value Trigger)
- Trace
- Profiling
Physical tool interfaces: Main Criteria
- Tool support/standardization
- Number of pins
- R/W access while running
- Bandwidth
- Robustness
- Security
Implementation of break points
- Hardware comparators for Instruction Pointer
- Equality and/or range comparators
- Break before make or break after make: a breakpoint can halt the program either before the target instruction is executed (break before make) or after it has been executed (break after make)
Implementation of single stepping
Needed HW support
- Built-in single stepping
- PC range trigger with break after make
Implementation of watchpoints
- Hardware comparators for data address, read and/or write access, and data value
- Equality and/or range comparators
- Usually with restrictions such as: no break before make, only unsigned values
Implementation of tracing
Program trace
Start or stop of trace are combined with watchpoints
Data trace
Address comparator needed to specify memory location
Instrumentation using data trace interface
- Use of debug trace interface
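Instrumentation is a software engineering technique that inserts specific code or markers (instrumentation points) into a program in order to collect data, monitor program behavior, or change how the program executes at run time.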
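A small sketch of software instrumentation, assuming a RAM ring buffer as the trace sink; on a real target the macro could write to the debug trace interface instead. Buffer depth, macro name, and event IDs are invented for illustration.

```c
#include <stdint.h>

#define TRACE_DEPTH 8u                      /* must be a power of two */

static uint16_t trace_buf[TRACE_DEPTH];     /* most recent event IDs */
static uint32_t trace_count;                /* total events recorded */

/* Instrumentation point: record an event ID, overwriting the oldest. */
#define TRACE_POINT(id) \
    (trace_buf[trace_count++ & (TRACE_DEPTH - 1u)] = (uint16_t)(id))

enum { EV_ENTER_FILTER = 1, EV_LEAVE_FILTER = 2 };

/* Example of an instrumented function. */
static int32_t filter(int32_t sample)
{
    TRACE_POINT(EV_ENTER_FILTER);
    int32_t result = sample / 2;
    TRACE_POINT(EV_LEAVE_FILTER);
    return result;
}
```

Each trace point costs a few instructions, which is the code change and timing impact that instrumentation always brings with it.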
Issues to solve for tracing
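Program flow with compression
Tracing the execution path of a large program produces huge amounts of trace data, so the data must be compressed to reduce storage and transfer costs. Typically the trace information per instruction has to be compressed to a few bits to bring the data rate down to an acceptable range
Trace of task switches needs to be possible
With a multitasking operating system, switches between tasks must be traceable in order to analyze the execution order of tasks and the overhead of context switches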
Multi-core, multi-master comes on top
High speed trace interfaces are expensive
Debug under real time conditions
Invasive debugging
Features that need to stop the processor or change the program execution flow significantly.
- Problem with peripherals interacting with physical world
Non-invasive debugging
Features that have no or very little effect on the program flow.
- Less determinism during debugging
Debugging and security
The goal of the debugger is
- Give access to all resources
- Enable modifications of memory/register contents
- Change the program flow
This can be used to hack a system!!!
- Access to the debug port must be protected; at least authentication by password is required
- Debug access should be permanently disabled for chips in the field
There is an inherent conflict between security and analysis capabilities (e.g. when handling field returns)!
Interrupts and Exceptions
Interrupts vs Exceptions
Both cause a change in the execution flow
- Interrupt: caused by an external event
- Exception: caused by a condition that occurs within the processor.
Components of the interrupt system
Interrupt Vector Table
Apart from enabling a given interrupt, the programmer must also have the means to tell the controller which particular interrupt service routine (ISR) should be called. The mapping of interrupts to ISRs is achieved with the interrupt vector table. The ATmega16, for example, has fixed interrupt priorities, which are determined by the vector number: the smaller the vector number, the higher the interrupt’s priority.
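The mapping can be sketched in C as a table of function pointers indexed by vector number. This is a host-side model with invented handler names, not the actual table of any device; as described for the ATmega16, the lowest pending vector number is served first.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*isr_t)(void);

static int timer_ticks;
static int uart_bytes;

static void timer_isr(void) { timer_ticks++; }
static void uart_isr(void)  { uart_bytes++;  }

/* Interrupt vector table: index = vector number, entry = ISR. */
static const isr_t vector_table[] = {
    timer_isr,   /* vector 0: highest priority */
    uart_isr,    /* vector 1 */
};

#define NUM_VECTORS (sizeof vector_table / sizeof vector_table[0])

/* Serve the pending interrupt with the smallest vector number.
 * `pending` holds one bit per vector; the served bit is cleared.
 * Returns the vector number served, or -1 if nothing was pending. */
static int dispatch(uint32_t *pending)
{
    for (size_t v = 0; v < NUM_VECTORS; v++) {
        if (*pending & (1u << v)) {
            *pending &= ~(1u << v);
            vector_table[v]();
            return (int)v;
        }
    }
    return -1;
}
```

In hardware the lookup is done by the interrupt controller; the loop here only models the priority rule.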
Stack when entering an interrupt or exception
xPSR: program status register; holds the flags and control bits that describe the current program state.
Nesting of interrupts
What determines interrupt latency?
Time to push current context on stack
memory access latency and speed of memory storing the stack
Time to fetch new instructions for interrupt service routine
memory access latency and speed of memory storing the vector table
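Longest run time of any multi-cycle non-interruptible instruction
On some processor architectures certain instructions cannot be interrupted: even if an interrupt request arrives, the processor must finish the instruction it is currently executing, which may take several clock cycles. Only then can it respond to the interrupt request.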
Resource conflicts, e.g. on busses used for pending transactions
Interrupt handling times for interrupts with higher priorities
Measures to reduce interrupt latency
Tail chaining
- A 2nd interrupt of the same or lower priority arrives during the execution phase of the 1st. The 2nd interrupt is executed immediately after the 1st, without unstacking in between
Late arrival
- A 2nd, higher-priority interrupt arrives during the stacking phase of the 1st. The 2nd interrupt is executed first, then the 1st (via tail chaining)
Pop preemption
- A 2nd interrupt arrives while the 1st is unstacking. Unstacking is abandoned and the 2nd is executed immediately
Types of interrupts in systems
- Level triggered
The peripheral raises the interrupt signal to the interrupt controller – The software (SW) needs to clear the interrupt bit in the peripheral
- Edge triggered
The peripheral generates a pulse of defined length if an interrupt has occurred – The interrupt event has to be stored in the interrupt controller – If the interrupt is executed the interrupt controller clears the flag
Interrupts versus polling
Polling
A program waits for a certain event in a peripheral and checks the value of an event flag in an endless loop
Advantages of polling
- The timing of a polling loop is very deterministic
- The whole system behavior is more deterministic
- The reaction on the event is faster than with an interrupt
Drawbacks of polling
- Waste of power (if interrupts are used CPU can be in sleep)
- No parallel execution of other tasks possible
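A polling loop in C might look like the following sketch. The status register and the flag mask are placeholders; a bounded iteration count is added as an assumption of this sketch so that the loop cannot hang forever if the event never occurs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Busy-wait until any bit of `flag_mask` is set in the status register,
 * giving up after `max_polls` reads. Returns true if the flag was seen. */
static bool poll_for_flag(const volatile uint32_t *status,
                          uint32_t flag_mask,
                          uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (*status & flag_mask) {
            return true;    /* event occurred */
        }
        /* The CPU stays busy here; with interrupts it could sleep. */
    }
    return false;           /* gave up: flag never appeared */
}
```

The `volatile` qualifier matters: without it the compiler could read the register once and spin on a stale value.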
Exceptions
An exception is a predefined interrupt resulting from
- External event (non maskable interrupt)
- Instruction (software reset, supervisor call, hypervisor call)
- Fault condition occurring during program execution
- Debug events
- System timer interrupts
Usage of exceptions
- Debugging of programming errors
- Scheduling in operating systems
- Switching between processor modes
- Reconfiguring MPU/MMU settings
Exception Types
Synchronous/Precise
Fault
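The return address points to the instruction that caused the exception. The exception handler may fix the problem and then restart the program, making it look like nothing has happened.
Trap
The return address points to the instruction after the one that has just completed.
Asynchronous/Imprecise
Abort
The return address is not always reliably supplied. A program which causes an abort is never meant to be continued.
Synchronous exceptions are raised by operations during program execution; the processor can determine precisely where the exception occurred and provides a matching handling mechanism. Asynchronous exceptions are usually caused by external factors; the processor may be unable to determine their cause and location accurately, so handling mostly means terminating the current program.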
Applications of interrupts
- Wakeup from sleep/power down mode
- Timer Interrupts
- Handling of communication peripherals
- Handling of coprocessors with long runtimes
- Direct Memory Access (DMA)
Boundary Errors and Control Hijacking Attacks
stack memory layout
Stack frame
A typical stack frame (of a function that calls other functions with at most four arguments), including the caller's frame:
Memory Encryption
Why memory encryption?
- Protection against product piracy
- Protection against modifications of critical parameters
- Protection against feature enabling
TPM
Definition of Trust in Computer Science
- Trusted: A system that operates as expected, according to design and policy.
- Trustworthy: A system is trustworthy if it is trusted and the trust can also be guaranteed in some convincing way, such as through a formal analysis or a code review.
- Trust level: The extent to which someone who relies on a system can be confident that the system meets its specifications, i.e., that the system does what it claims to do and does not perform unwanted functions
Two alternatives for a security core
- Hardware protection: TPM as a dedicated Hardware security processor using HW separation
- Software protection: DICE as a low cost software security solution using temporal separation
DICE: Device Identifier Composition Engine
Prerequisite: Unique Device Secret (UDS) established by manufacturer
DICE Boot Code = Root of Trust for Measurement
Trusted Platform Module
TPM is a cryptographic co-processor (basically a SmartCard), with the following characteristics
- True random number generation
- Small set of cryptographic functions: Key generation, signing, encryption, hashing, MAC
- Secure storage
- Platform integrity measurement and reporting
Trusted Boot: Chain of Trust
Establishment of a chain of trust using transitive trust
- The Core Root of Trust for Measurement (CRTM) measures its own integrity and that of the next entity in the chain before that entity is executed
- Subsequently, each component measures the next component before handing over control
- All obtained measurements (platform metrics) are written to a TPM
How to record the platform state?
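Component \(k\) measures component \(k+1\), records the result in the Stored Measurement Log (SML), extends the PCR value, and then hands over control to component \(k+1\)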
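The PCR extend operation is, in standard TPM terms, a hash chain. Writing \(m_{k+1}\) for the measurement (hash) of component \(k+1\), \(H\) for the hash function of the PCR bank, and \(\|\) for concatenation (symbols introduced here for illustration):

\[
\mathrm{PCR}_{\text{new}} = H\left(\mathrm{PCR}_{\text{old}} \,\|\, m_{k+1}\right)
\]

Because the old value enters the hash, a PCR cannot simply be set to a chosen value; it reflects the whole sequence of measurements made since reset.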
RTM: Root of Trust for Measurement
Verifying the platform metrics
TPM Credentials and Keys
Endorsement Key (EK)
Issued by the entity that generates the EK as part of the manufacturing process.
Platform Credential
Identifies the platform manufacturer
Conformance Credential
Issued by an evaluation entity
Important Questions
Microcontroller basics
Draw steps from a blank command line until a running program
Have a look into the *.o, *.elf, *.hex, and *.lst files
- *.o: compiled but unlinked versions of your source files, not human readable
- *.elf: compiled and linked programs, ready to execute on the architecture they were built for
- *.hex: the pure machine code together with the addresses to which the programmer should copy the machine instructions; it contains no architecture information or other metadata as the *.elf file does
- *.lst: a human readable copy of parts of the *.elf file. What goes in here is set by options to objdump, but usually it is the section headers and the disassembly of the .text section interleaved with the C code it was compiled from.
How does the workflow for embedded programs differ from the one for regular computer programs?
- Cross-compiler instead of compiler
- Device header and device linker file needed
- Often additional libraries and drivers necessary
- Programming onto uC as another, final step
Describe and use memory mapped I/O
- Concept: “Everything is a memory after all”
- Peripherals are mapped into regular address space
- The “von Neumann” approach for peripherals
How about Port Mapped I/O
- Peripherals have their own address space
- Access via special instructions, e.g. IN, OUT instead of LD, ST
- The “Harvard” approach for peripherals
Benefits and downsides of memory mapped I/O
Benefits:
- Allows reuse of much of the existing circuitry: address & data buses, bus interfaces, memory protection features
- No additional instructions required → simpler instruction set
Downsides:
- The memory bus has to connect to each and every peripheral; a longer bus reduces the max. clock frequency, especially when going off-chip
Benefits and downsides of port mapped I/O
Benefits:
- Less performance necessary → easier/smaller bus design possible
- Peripheral bus does not slow down memory access
Downsides:
- Additional complexity due to 2nd or even 3rd address space
List common peripherals and explain their purpose
- ADC (Analog-to-Digital Converter) and DAC (Digital-to-Analog Converter): convert between analog signals and digital values. VADC: Versatile Analog-to-Digital Converter.
- UART (Universal Asynchronous Receiver/Transmitter) : a communication protocol and hardware interface used for serial communication.
- USB (Universal Serial Bus)
- CAN (Controller Area Network): communication protocol used in vehicles and other industrial applications to allow different electronic components to communicate with each other.
- CCU (Clock Control Unit): manages the timing and synchronization of different components within a system.
- SCU(System Control Unit): controls power management, system reset, clock settings, and other system-level operations.
- ETH (Ethernet): a standard communication protocol used for connecting devices in a local area network (LAN)
The addresses 0x40000000 through 0x5FFFFFFF are reserved for peripherals. How many 32-bit locations does this range hold?
The range makes up \(2^{29}\) bytes = 0.5 GiB, i.e. \(\frac{2^{29}}{4} = 2^{27} = 134217728\) locations of 32 bit each
Describe PWM
In digital circuits, the easiest way to produce such a signal is to let a counter count up to a certain period value, at which the counter resets and counts up again. The output of the counter is fed into a comparator that compares it against the compare value in every cycle. If a match occurs, the output flip-flop is reset (or set, in the case of the XMC4500).
List in which registers function arguments are passed and where the return value is placed
- R0: 1st argument, return value
- R1: 2nd argument, return value if 64 bit
- R2: 3rd argument
- R3: 4th argument
- Further arguments are passed on the stack
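The register assignment can be annotated on ordinary C functions. The comments describe what a compiler following the ARM procedure call standard (AAPCS) does on a Cortex-M; compiled on a PC, the same code runs but uses that platform's convention. Function names are invented for the sketch.

```c
#include <stdint.h>

/* Under AAPCS:
 *   a -> R0, b -> R1, c -> R2, d -> R3, e -> passed on the stack
 *   32-bit return value -> R0 */
int32_t sum5(int32_t a, int32_t b, int32_t c, int32_t d, int32_t e)
{
    return a + b + c + d + e;
}

/* A 64-bit return value uses the register pair R0 (low word)
 * and R1 (high word). Arguments: x -> R0, y -> R1. */
uint64_t widen_mul(uint32_t x, uint32_t y)
{
    return (uint64_t)x * y;
}
```

Inspecting the *.lst file of such a function is a quick way to see the fifth argument being loaded from the stack.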
Distinguish caller-saved and callee-saved registers
Caller save: R0-R3, R12, PC, LR
Callee save: R4-R11, SP
Understand instructions: ADD, ADDS, SUB, MUL, DIV, NEG, NOT, AND, OR, LSL, LSR, ASR, EOR, BIC, CMP, CMN, TST, REV, B, BX, BLX, BL, CBZ, CBNZ, LDR, LDRSH, LDRSB, STR, STRH, STRB, LDM, LDMIA, STM, STMIA
see: https://developer.arm.com/documentation/qrc0006/e/
RISC vs CISC
RISC:
- Reduced instruction set computer
- Discrete instructions for load / store from memory to register
- More instructions but less complex CPUs → higher clock rates
- Fixed length instructions: 16, 32 bit
CISC:
- Complex instruction set computer
- Various load / store modes integrated into each instruction
- Fewer instructions but more complex CPUs → lower clock rates
- Variable length instructions
Architecture of ARM Cortex-M
RISC: distinct load and store instructions, no memory addressing modes for data processing instructions (e.g. ADD), and fixed length instructions.
ARM's Thumb Mode?
- If we jump to an even address, we interpret all that follows as ARM code
- If we jump to an odd address, instructions are decoded as Thumb
- Cortex-M supports only Thumb code, but indication is still kept
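The role of the lowest address bit can be modeled in a few lines of C. The function names are invented; the rule itself (bit 0 of a branch target selects Thumb, and instructions are fetched from the address with bit 0 cleared) is the one described in the list above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a branch target selects the instruction set: 1 = Thumb. */
static bool target_is_thumb(uint32_t target)
{
    return (target & 1u) != 0u;
}

/* Address from which instructions are actually fetched (bit 0 cleared). */
static uint32_t fetch_address(uint32_t target)
{
    return target & ~1u;
}
```

For a vector entry with the value 0x08000201, for instance, execution starts at 0x08000200 in Thumb state.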
List the benefits from having both 16 bit and 32 bit instructions
- Smaller code size, because frequently used instructions are only 16 bit yet the diversity of instructions equals 32 bit and one can mix them directly
- Portability, because code written for small processors, which only support 16 bit instructions, can directly run on larger processors that support all instructions
Show three ways to bring a 32 bit literal into r0
Using MOV for the lower 16 bit and MOVT for the upper 16 bit
Jumps are done using B or BX, name the instructions to call a function
The LR needs to be updated to the address of the next instruction after the call. From the quick reference card: BL and BLX do that.
Name the three steps of the Cortex-M4 processor pipeline
Fetch, Decode, Execute
Characterization of debug methods: LED, printf, debugger, simulation. Aspects: requirements, amount of information, code change, delays, real-time
- LED: unused GPIO for LED, little info, changed code, short delay, real-time
- printf: UART, medium info, changed code, delay depends on length of info, real-time
- Debugger: debug port available and software to control the debugger, very much info, no change, delay depends (HW breakpoints are fast)
- Simulation: simulator software, a lot of info, much slower overall, real-time
GDB: Halt execution upon reaching function foo()
break foo
GDB: Execute single line of source / assembler code
step, stepi
GDB: Halt execution when variable Bytes is changed
watch Bytes
GDB: Continue execution for the time being
continue
GDB: Remove the breakpoint number 2
delete 2
GDB: Set variable counter to 7
set var counter = 7
GDB: Print the values of variable counter / register r3
print counter
print $r3
GDB: Examine the memory at address 0x08000000 as 32 bit value in hex
x /1wx 0x08000000
GDB: show all local variables of the current function
info locals
Name examples: exception vs. interrupt
Exception: segmentation fault, divide by 0, overflow, page fault
Interrupt: disk, network, keyboard, clock for timesharing
Explain what happens if an ISR updates the original variable in SRAM
If the compiler keeps a copy of the variable in a register, the software will continue to use the old value and not even notice that the value in SRAM was updated. If it is a wait loop that postpones code execution until, e.g., a certain number of bytes are received by the UART, the system will hang forever.
Name the keyword that can be given to a variable to avoid this issue
volatile size_t bytesReceived;
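Put in context, the declaration might be used as in the following sketch. The ISR and helper names are hypothetical, and on a host the "ISR" is just a function that a test can call by hand.

```c
#include <stdbool.h>
#include <stddef.h>

/* Shared between ISR and main code: volatile forces a fresh read
 * from SRAM on every access instead of a cached register copy. */
volatile size_t bytesReceived;

/* Hypothetical UART receive ISR: one call per received byte. */
void uart_rx_isr(void)
{
    bytesReceived++;
}

/* Check used by the main code inside a wait loop, e.g.
 *   while (!have_bytes(4)) { }
 * which ends once the ISR has run four times. */
bool have_bytes(size_t wanted)
{
    return bytesReceived >= wanted;
}
```

Note that `volatile` only prevents the compiler from caching the value; it does not make read-modify-write sequences atomic, which is fine here only because the ISR is the sole writer.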
Boot Steps (Understand)
- There is a BootROM containing the SSW (startup software), and it is the first piece of code to be executed
- SSW takes care of different startup modes, in normal mode – which is the only one we consider here – control is then handed over to FLASH
- The FLASH starts again with the interrupt vector table (IVT), but here the first entry is not the address of the reset routine, but the initial value for the main stack pointer MSP. Only the second is the reset vector – in little endian, so we need to look at 0x08000201
- The next step calls a function that we can trace as the SystemInit() function at address 0x080005cc. This function sets up the wait states for the FLASH, tells the NVIC that the IVT is at 0x08000000, and sets up the clock tree. We will not dive into this code here.
- What follows is a simple copy loop. It initializes the data section with the initialization values stored in the FLASH. To be more flexible, the copying is done in two nested loops to allow multiple blocks of data at different locations.
- After that, there is another loop which zeroes the BSS section. Again, there is a nested loop to support more than one consecutive block to clear
- A third loop calls the global constructors in case we code in C++
- Finally, we branch to main()
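The copy and zero loops from the steps above can be sketched in C. This is a simplified single-block version, not the actual XMC4500 startup code; the linker symbol names in the closing comment are assumptions and differ between toolchains.

```c
#include <stdint.h>

/* Copy the initial values of .data from their load address in flash
 * to their run-time address in RAM, one 32-bit word at a time. */
static void copy_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = *load++;
    }
}

/* Clear .bss so that uninitialized globals start out as zero. */
static void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}

/* In real startup code the bounds come from linker-script symbols,
 * for example (assumed names):
 *   extern uint32_t __data_load, __data_start, __data_end;
 *   extern uint32_t __bss_start, __bss_end;
 *   copy_data(&__data_load, &__data_start, &__data_end);
 *   zero_bss(&__bss_start, &__bss_end);
 * followed by the constructor calls and the branch to main(). */
```

Supporting several blocks, as the real code does, would wrap each helper in an outer loop over a table of (source, destination, length) entries.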
Memory
Indicate in which direction heap and stack grow
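In which section and in which SRAM is an uninitialized global variable placed on the XMC4500? And in which SRAM is the stack placed?
The section is .bss, independent of the device. On the XMC4500, .bss is located in DSRAM1 and the stack in PSRAM.
Which endianness does the XMC4500 use?
Little endian: the least significant byte is at the lowest address.
What is the Heartbleed vulnerability?
Heartbleed is a security vulnerability in OpenSSL that allows an attacker to steal sensitive information, such as private keys and user credentials, from a server. It exploits a flaw in the TLS heartbeat extension: the payload length sent by the client is not validated correctly, so the server unintentionally leaks data from its memory. (Unchecked length -> buffer over-read)
Discuss which properties the value inside the canaries needs to have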
- Unpredictable and not readable for the attacker in any way
- Large enough to avoid trying out all possible values
- Ideally change upon each program invocation (a change upon each function call would make the program terribly slow)
How to configure MPU using MPUeasy
- MPUeasyPermissions: MPUeasy_None_None, MPUeasy_RW_None, MPUeasy_RW_R = 2, MPUeasy_RW_RW = 3, MPUeasy_R_None = 5, MPUeasy_R_R = 6
- baseAddress
- size: as a power of 2 (e.g. 10 means \(2^{10}\) bytes = 1 KiB)
- priority: low priority on the base
Does the MPU of a Cortex-M4 protect actions by DMA peripherals like GPDMA0/1?
No. The MPU is located within the CPU and can only check memory accesses performed by the CPU.
Explain what problem arises from a use-after-free bug
The memory locations might already be allocated for a different purpose. Reading from them may cause the function to perform unexpected and possibly exploitable actions. Writing to them clobbers data of the other code that the memory is now allocated to and may cause that code to malfunction.
Propose a simple mitigation that protects against damage even if free() is called on ptr again
Several mitigations are possible, but they all have in common that it is necessary to track whether the pointer currently points to a malloc chunk or not (because it either was already freed or the allocation failed). The easiest way to do this is to always set the pointer to NULL when it is freed, just like malloc and calloc return a NULL pointer when the allocation fails. You may even consider writing your own implementation of free that sets the pointer to NULL after calling the original free. According to the C standard, freeing a NULL pointer does no harm. Thus a double free can no longer cause damage.
Give two vulnerabilities involving the heap
Heap based buffer overflow, use-after-free, double free
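The set-to-NULL mitigation against double free can be wrapped in a macro. This is one possible sketch with an invented macro name; it protects only the pointer variable it is applied to, not other copies of the same address.

```c
#include <stdlib.h>

/* Free the allocation and reset the pointer variable in one step.
 * free(NULL) is defined to do nothing, so applying the macro twice
 * to the same variable is harmless. */
#define FREE_AND_NULL(p) \
    do {                 \
        free(p);         \
        (p) = NULL;      \
    } while (0)
```

A macro is used rather than a function so that it can assign to the caller's variable regardless of its pointer type.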
List what can be achieved by controlling the format string of a printf() call?
Read memory, write memory, and ultimately anything a Turing machine can compute
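The usual fix is to never pass attacker-controlled text as the format argument. A minimal sketch, with an invented helper name, that treats the input purely as data:

```c
#include <stdio.h>

/* UNSAFE: the input is used as the format string, so conversion
 * specifiers inside it are interpreted:
 *     printf(user_input);
 *
 * SAFE: a constant format string; the input is only data. */
static int format_safely(char *out, size_t out_size, const char *user_input)
{
    return snprintf(out, out_size, "%s", user_input);
}
```

With the constant format, an input such as "%x %x %n" is copied verbatim instead of reading from or writing to memory.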
When is there a risk of code injection?
Whenever code and data are only weakly separated, e.g. in von Neumann architectures or scripting languages.
Which mode should be used in external memory encryption and disk encryption
External memory: ESSIV
Disk: CTS
Crypto and Side Channel
Security objectives: meaning and countermeasures
- Confidentiality: protection from non-authorized information retrieval. Countermeasures: encryption, access control
- Integrity: protection from non-authorized and unnoticed modification of data. Countermeasures: cryptographic signature (messages), write protection (stored data)
- Authenticity: proof of the identity of an object/subject. Countermeasures: passwords, cryptographic signatures
- Accountability: protection against denying that a performed activity was carried out. Countermeasures: logging
- Availability: protection from non-authorized interference with the usability or correct function of a system. Countermeasures: redundancy, overload protection
- Privacy: protection of personal data and any data regarding the private sphere to ensure the right of self-determination. Countermeasures: data minimization, pseudonyms
Give examples for: symmetric cryptography (block and stream), asymmetric cryptography, cryptographic hash functions
- Block cipher: AES (block size 128 bit, key size 128 bit, 192 bit, 256 bit)
- Stream cipher: ChaCha20 (key size 256 bit)
- Asymmetric Cryptography: RSA
- Cryptographic Hash Functions: SHA-2
Apply methods to do: Firmware encryption, Firmware image signing, Disk encryption
Firmware encryption: AES
Firmware image signing: RSA
Disk encryption: AES
Analyze what happens if an attacker redirects the entire traffic to one of his own servers
According to the problem description, the IoT node uses whatever public key the server sent to it. Thus the IoT node will happily accept the attacker’s public key and use it to establish a signed and encrypted channel, not knowing that this channel ends in a malicious server instead of the one the IoT node expects. To avoid such attacks, the IoT node could have a factory-programmed list of public keys for the servers.
Summarize Kerckhoffs's principle
Keep the key secret, but publish the algorithm.
Describe the benefit of certificates over raw public keys
You can check the certificate's signature to avoid using an attacker’s public key instead of the legitimate one.
What is a side channel?
Any physical quantity related to the operation of a cryptosystem but not intended to carry information.
List physical quantities which might pose a side channel with a non-zero amount of information
Time, power, electromagnetic emanations, acoustic emanations, temperature, light
Why is implementing crypto harder than implementing other algorithms?
Because apart from working correctly, the code must also be fault tolerant and must not leak information via side channels.
Communication and TPM
Collect a list of common embedded communication standards with their topology, length, and frequency
- IIC: bus, 3 m, 100 kHz
- IIS: P2P, 10 m, any
- SPI: star, 10 m, any
- CAN: bus, 500 m, 125 kHz
- UART: P2P, 10 m, any, 4 pins
- RS-232: P2P, 1000 m, 9600 Hz
Choose: A microcontroller talks to an SD-Card
SPI or QuadSPI (SPI with four data lines), because it can run at high frequency and thus provides high throughput. There is only one master and one slave, so the effort for a dedicated slave select line is not relevant.
Choose: In a data logger, 32 ADCs have to be connected to a microcontroller
IIC, because the throughput is sufficient for a data logger and it is very easy to connect a lot of slaves.
Choose: 100 small fertilizer nodes on a field need to exchange humidity values between each other
CAN, because it allows communication between all nodes and is robust enough to be put on a field.
Choose: An industrial PC needs data from a few temperature sensors located several hundred meters away
RS-232, because it is easy to implement in a small temperature sensor node and supports long distances, especially if you choose a low baud rate. Maybe CAN if your controller has native support for it and you can afford the PHY chip.
Choose: A DSP reads data from multiple ADCs and outputs to a DAC
IIS or SPI, because they provide the necessary throughput and can be used in a fully synchronous design that ensures a continuous uninterrupted stream of samples from inputs to output.
Outline the security risk that arises if a device can have multiple interfaces and drivers are loaded automatically
A malicious device might look like a mouse but contain a mass storage device as well, or look like a thumb drive but also enumerate as a keyboard, which starts typing some predefined key strokes to infect the host.
Some systems feature autostart on USB devices. Explain the security risk.
Autostart can trigger an infection if the malware is put in the autostart path.
Explain why there are generic device classes for USB
Because they make it possible to write generic drivers for, e.g., keyboards, mice, cameras.
Gather a list of signals that you consider essential to, e.g., an MCU. Outline how these can be manipulated
- Power supply: glitch exactly when password is compared so it always passes
- Clock: increasing the clock frequency introduces random errors in computations or program flow, similar to reducing the supply voltage
Collect ideas how to attack ICs after opening up their case
- Lasers: the light induces a photo-current that can set or reset bits in registers
List the requirements of DICE regarding storage capabilities, access control, mutability of code
- UDS: storage that only the DICE boot code can read; access must be locked before control passes to later boot stages
- DICE boot code: A few KiB that are either one-time programmable or only programmable via special updater in DICE boot code itself.
Outline why implementing DICE on an XMC4500 is difficult
Because you do not have secure storage for the UDS and the DICE firmware, or you lose privileged mode if you realize it via the MPU.
Decide and give reasons why or why not the “unique chip ID” of the XMC4500 is an appropriate choice for the “unique device secret” of DICE.
- First of all, the unique chip ID has only 128 bit, whereas the recommended length of the UDS is at least 256 bit.
- The unique chip ID contains nowhere near 128 bit of entropy, despite its length.