Chapter 4: Machine Learning for IoT


Chourouk Guettas
10/10/2024

Table of contents

I - Understanding IoT Constraints
   1. Resource Limitations
   2. Practical Demonstration
II - Lightweight Machine Learning Models
   1. Model Selection Criteria for IoT
   2. Commonly Used Lightweight Models
   3. Model Optimization Techniques
III - Online Learning in IoT Environments
   1. Concept and Importance
   2. Online Learning Algorithms: SGD
   3. Online Learning Algorithm: Online Random Forests
   4. Online Learning Algorithm: Adaptive Algorithms
   5. Challenges and Solutions
IV - Edge Computing for Machine Learning in IoT
   1. Edge Computing Fundamentals
   2. Benefits of Edge AI: BLERP
   3. Implementing ML at the Edge
V - Machine Learning for IoT: Practical Considerations
   1. Model Selection Guidelines
   2. Implementation Challenges

I - Understanding IoT Constraints

Introduction

Before diving into implementing machine learning models on IoT devices, it is crucial to understand the unique constraints and challenges that IoT environments present. These constraints significantly shape how we design and deploy ML solutions.

1. Resource Limitations

1. Processing Power Constraints

IoT devices typically use microcontrollers or low-power processors that are far less powerful than the processors found in traditional computing environments.

Common IoT processors and their typical clock speeds:
- ARM Cortex-M series: 48-216 MHz
- ESP32: dual-core, up to 240 MHz
- Raspberry Pi Zero: 1 GHz, single-core

Impact on ML implementation:
- Limited ability to run complex models
- Longer inference times
- Need for model optimization

2. Memory Limitations

IoT devices often have severely restricted memory, both RAM and storage.

Typical memory constraints:
- RAM: often between 32 KB and 512 KB
- Flash storage: usually 256 KB to 4 MB

Implications for ML models:
- Model size limitations
- Restricted ability to store historical data
- Limited space for intermediate computations

Memory optimization strategies:
- Model quantization
- Feature selection to reduce input dimensionality
- Streaming algorithms for data processing

3. Energy Consumption Considerations

Many IoT devices operate on batteries or other limited power sources.

Power sources in IoT:
- Batteries
- Solar cells
- Energy harvesting

Energy impact of ML operations:
- Model inference energy cost
- Data transmission energy cost
- Sleep/wake cycles

Best practices for energy efficiency:
- Batch processing when possible
- Reducing wireless transmissions
- Using low-power modes between operations

4. Network Bandwidth Restrictions

IoT devices often rely on wireless communication with limited bandwidth.
Common IoT communication protocols:
- Bluetooth Low Energy (BLE): 1 Mbps
- LoRaWAN: 0.3-50 kbps
- ZigBee: 250 kbps

Bandwidth challenges:
- Limited ability to transmit raw data
- Constraints on model updates
- Impact on real-time processing

[Figure: Popular microcontrollers for TinyML applications]

2. Practical Demonstration

Practical example: refer to the attached file SamrtThermostat.pdf.

II - Lightweight Machine Learning Models

1. Model Selection Criteria for IoT

1. Model Size and Complexity

IoT devices often have limited storage and processing capabilities, so we need to prioritize models with a small footprint and low computational complexity.

- Size: the memory required to store the model (e.g., kilobytes or megabytes).
- Complexity: often quantified by the number of parameters or the number of floating-point operations (FLOPs) required for inference.

Criterion      Typical Constraints
Model Size     Flash memory: 32 KB - 4 MB
Parameters     Hundreds to thousands
Computation    Limited CPU cycles

Example: Model Size Comparison

    import pickle

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.tree import DecisionTreeRegressor

    # Sample data
    X = np.arange(24).reshape(-1, 1)
    y = np.array([20, 19, 18, 17, 16, 17, 19, 21, 23, 24, 25, 26,
                  27, 27, 26, 25, 24, 23, 22, 21, 21, 20, 20, 19])

    # Different models
    models = {
        "Decision Tree": DecisionTreeRegressor(max_depth=3),
        "Random Forest": RandomForestRegressor(n_estimators=5, max_depth=3),
    }

    def get_model_size(model):
        """Size of the pickled model in bytes (a rough proxy for its footprint)."""
        return len(pickle.dumps(model))

    # Train and compare
    for name, model in models.items():
        model.fit(X, y)
        size = get_model_size(model)
        print(f"{name} size: {size} bytes")

2. Inference Speed

Inference speed is determined by how fast a model can process new input data and provide predictions.
A model should be able to make predictions quickly, even on devices with limited processing power. Speed is usually reported in milliseconds per prediction or in frames per second (FPS).

Example: A lightweight CNN like MobileNet can process images at 30 frames per second on a mobile GPU, while a larger model like ResNet might only achieve 5 FPS on the same hardware.

(Add this code to the previous one.)

    import time

    def measure_inference_time(model, X, iterations=1000):
        """Average time in seconds for a single-sample prediction."""
        # perf_counter is a monotonic, high-resolution clock suited to timing
        start_time = time.perf_counter()
        for _ in range(iterations):
            model.predict(X[:1])
        end_time = time.perf_counter()
        return (end_time - start_time) / iterations

    # Measure and compare inference times
    for name, model in models.items():
        inference_time = measure_inference_time(model, X)
        print(f"{name} inference time: {inference_time*1000:.4f} ms")

3. Accuracy vs. Resource Trade-offs

We often need to balance predictive accuracy against resource utilization. In IoT, a slightly less accurate model that runs efficiently might be preferable to a highly accurate but resource-intensive one.

- Use techniques like the Pareto frontier to visualize and select optimal trade-offs.
- Consider the specific requirements of your application (e.g., is 95% accuracy sufficient, or do you need 99%?).

Example: For a smart thermostat, a simple linear regression model might provide sufficient accuracy (±1°C) while using minimal resources. A more complex model might improve accuracy to ±0.1°C, but at the cost of higher energy consumption, which may not be justified for this application.
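The Pareto-frontier idea mentioned above can be sketched in a few lines. This is a minimal illustration rather than part of the original example: the candidate names and their (size, error) values are made up, and `pareto_front` is a hypothetical helper. A model is on the frontier when no other model is at least as good on both criteria and strictly better on one.

```python
def pareto_front(candidates):
    """Return names of candidates not dominated on (size, error); lower is better for both."""
    front = []
    for name, (size, err) in candidates.items():
        dominated = any(
            (s <= size and e <= err) and (s < size or e < err)
            for other, (s, e) in candidates.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front

# Hypothetical numbers for illustration only: (model size in bytes, mean squared error)
candidates = {
    "linear": (500, 4.0),
    "tree_depth3": (2000, 1.5),
    "forest_5": (9000, 1.2),
    "forest_bloated": (20000, 1.6),
}

front = pareto_front(candidates)
print(front)
```

Here `forest_bloated` is dropped because `tree_depth3` is both smaller and more accurate; the remaining three models are all defensible choices, and the application's memory budget decides among them.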
The complete script, combining size, speed, and accuracy measurements:

    import pickle
    import time

    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.metrics import mean_squared_error
    from sklearn.tree import DecisionTreeRegressor

    # Sample data
    X = np.arange(24).reshape(-1, 1)
    y = np.array([20, 19, 18, 17, 16, 17, 19, 21, 23, 24, 25, 26,
                  27, 27, 26, 25, 24, 23, 22, 21, 21, 20, 20, 19])

    # Different models
    models = {
        "Decision Tree": DecisionTreeRegressor(max_depth=3),
        "Random Forest": RandomForestRegressor(n_estimators=5, max_depth=3),
    }

    def get_model_size(model):
        """Size of the pickled model in bytes (a rough proxy for its footprint)."""
        return len(pickle.dumps(model))

    def measure_inference_time(model, X, iterations=1000):
        """Average time in seconds for a single-sample prediction."""
        start_time = time.perf_counter()
        for _ in range(iterations):
            model.predict(X[:1])
        return (time.perf_counter() - start_time) / iterations

    def evaluate_model(model, X, y):
        # Note: the error is measured on the training data, so it is optimistic
        predictions = model.predict(X)
        mse = mean_squared_error(y, predictions)
        size = get_model_size(model)
        inference_time = measure_inference_time(model, X)
        return mse, size, inference_time

    # Train, evaluate, and collect the results
    results = {}
    for name, model in models.items():
        model.fit(X, y)
        mse, size, infer_time = evaluate_model(model, X, y)
        results[name] = {"MSE": mse, "Size": size, "Time": infer_time}
        print(f"{name} MSE: {mse:.2f}, Size: {size}, Inference Time: {infer_time*1000:.4f} ms")

    # Visualization of trade-offs
    plt.figure(figsize=(10, 6))
    for name, metrics in results.items():
        plt.scatter(metrics["Size"], metrics["MSE"], label=name)
    plt.xlabel("Model Size (bytes)")
    plt.ylabel("Mean Squared Error")
    plt.title("Model Size vs. Accuracy Trade-off")
    plt.legend()
    plt.show()

2. Commonly Used Lightweight Models

1. Decision Trees and Random Forests

- Decision Trees: simple, interpretable models that make decisions based on a series of if-then rules.
- Random Forests: ensembles of decision trees that often provide better accuracy while still maintaining relatively low computational requirements.

Advantages:
- Low memory footprint
- Fast inference
- Can handle both classification and regression tasks

2. Naive Bayes Classifiers

Probabilistic classifiers based on Bayes' theorem with strong (naive) independence assumptions between features.

Advantages:
- Very low computational requirements
- Works well with high-dimensional data
- Performs well even with limited training data

3. Linear Models

- Logistic Regression: for binary classification tasks.
- Linear Support Vector Machines (SVMs): for both binary and multiclass classification.

Advantages:
- Extremely fast inference
- Low memory requirements
- Often provide good baseline performance

4. Lightweight Neural Networks

For more complex tasks that require the power of neural networks, several architectures have been developed specifically for resource-constrained environments:

- Compressed/Quantized Networks: standard architectures that have been compressed or had their weights quantized to reduce size and increase inference speed.
- MobileNet and similar architectures: designed from the ground up for mobile and embedded vision applications.

Advantages:
- Can handle complex tasks like image recognition
- Significantly reduced parameter count and computational requirements compared to standard CNNs

Example use case: object detection in smart security cameras.

3. Model Optimization Techniques

1. Pruning

Pruning removes unnecessary weights or neurons from a neural network. It can significantly reduce model size with minimal impact on accuracy.

Techniques:
- Magnitude-based pruning (see the attached file Pruning_Technique.pdf)
- Structured pruning (removing entire channels or layers)

2. Quantization

Quantization reduces the precision of the weights and activations in a neural network, for example by converting 32-bit floating-point numbers to 8-bit integers.

Benefits:
- Reduced memory footprint
- Faster computation, especially on hardware with integer-only arithmetic

    import numpy as np

    def quantize_model(model, bits=8):
        """Quantize model parameters to reduced precision.

        The values stay stored as floats; rounding to a grid with step
        1 / (2**bits - 1) only simulates the loss of precision.
        """
        scale = (2 ** bits) - 1

        if hasattr(model, 'mean') and hasattr(model, 'var'):
            # Quantize Naive Bayes parameters
            model.mean = np.round(model.mean * scale) / scale
            model.var = np.round(model.var * scale) / scale

        return model

    # Example usage
    # LightweightGaussianNB is assumed to be defined elsewhere in the course
    # material: a Gaussian Naive Bayes class exposing `mean` and `var` arrays.
    model = LightweightGaussianNB()
    model.fit(X, y)
    quantized_model = quantize_model(model, bits=8)

3. Knowledge Distillation

Knowledge distillation trains a smaller "student" model to mimic the behavior of a larger "teacher" model. It transfers knowledge from complex models to simpler ones that can run on IoT devices. See the attached file Knowledge Distillation.pdf.

III - Online Learning in IoT Environments

Introduction

Online learning, also known as incremental learning, is a machine learning paradigm where the model learns continuously from a stream of data, updating its parameters sequentially as new data arrives. This approach is particularly important in IoT environments where:
- Data arrives continuously from sensors
- Storage capacity is limited
- System requirements change over time
- Real-world conditions evolve dynamically

Traditional ML vs. Online Learning
- Traditional ML: Training → Deployment → Static Use
- Online Learning: Training → Deployment → Learning continues → Model updates → Learning continues...

1. Concept and Importance

1. Definition of Online Learning

Online learning algorithms process data sequentially, making them ideal for IoT applications.
Key characteristics:
- Process one instance (or mini-batch) at a time
- Update model parameters immediately
- Discard processed data points
- Maintain a fixed memory footprint
- Adapt to changing patterns in real time

2. Advantages for IoT Applications

- Memory Efficiency: only the current data point is needed in memory
- Adaptability: can adjust to changing patterns and environments
- Real-time Learning: immediate incorporation of new information
- Resource Optimization: efficient use of limited IoT resources
- Continuous Improvement: the model evolves with new data

2. Online Learning Algorithms: SGD

1. Stochastic Gradient Descent (SGD)

SGD is fundamental to online learning. It updates model parameters one sample at a time:

    def sgd_update(model, x, y, learning_rate):
        """One SGD step, assuming a linear model with squared-error loss."""
        prediction = model.predict(x)
        error = y - prediction
        # For squared error, the descent direction for the weights is error * x
        gradient = error * x
        model.weights += learning_rate * gradient

Key considerations:

Learning rate scheduling: The learning rate is the "step size" the model takes when updating its parameters. A fixed learning rate can be problematic in online settings:

    # Fixed learning rate (not optimal for online learning)
    learning_rate = 0.01
    model.weights += learning_rate * gradient

Learning rate scheduling adjusts the step size dynamically:

    # Example of a simple decay schedule
    def get_learning_rate(iteration):
        initial_rate = 0.1
        decay_factor = 0.1
        return initial_rate / (1 + decay_factor * iteration)

Common scheduling strategies:
- Step decay: reduce the rate by a factor after N iterations
- Exponential decay: smoothly decrease the rate over time
- Adaptive: adjust the rate based on performance metrics
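The step and exponential decay strategies listed above could be written as follows. This is only a sketch: the function names, the default rates, and the drop interval are illustrative assumptions, not values from the course material.

```python
import math

def step_decay(iteration, initial_rate=0.1, drop=0.5, every=100):
    """Multiply the rate by `drop` once every `every` iterations."""
    return initial_rate * (drop ** (iteration // every))

def exponential_decay(iteration, initial_rate=0.1, k=0.01):
    """Shrink the rate smoothly: initial_rate * exp(-k * iteration)."""
    return initial_rate * math.exp(-k * iteration)

for it in (0, 100, 250):
    print(it, step_decay(it), round(exponential_decay(it), 5))
```

Step decay keeps the rate constant between drops, which is cheap to compute on a microcontroller, while exponential decay avoids sudden jumps in the update size.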
Mini-batch processing: Instead of processing one data point at a time, mini-batches process small groups:

    # update_model, gather_samples and gather_labels are placeholders
    # for application-specific functions.

    # Single sample (pure online)
    update_model(x_single, y_single)

    # Mini-batch (better stability)
    batch_size = 32
    mini_batch_x = gather_samples(batch_size)
    mini_batch_y = gather_labels(batch_size)
    update_model(mini_batch_x, mini_batch_y)

Benefits:
- More stable updates than single-sample processing
- Better utilization of parallel processing
- Reduced variance in parameter updates
- Still maintains online learning benefits

Challenges:
- Need to balance batch size with memory constraints
- Must consider the data arrival rate in the IoT context

Momentum-based variations: Momentum adds "inertia" to parameter updates to smooth out oscillations. It is particularly useful for noisy IoT sensor data.

    class SGDWithMomentum:
        def __init__(self):
            self.velocity = 0
            self.momentum = 0.9  # Typical value

        def update(self, gradient, learning_rate):
            # Update velocity with momentum
            self.velocity = self.momentum * self.velocity - learning_rate * gradient
            # The caller adds the returned velocity to the parameters
            return self.velocity

Common variations:
- Classical momentum: helps maintain the direction of updates
- Nesterov momentum: looks ahead to anticipate parameter changes
- Adam: combines momentum with adaptive learning rates

In IoT contexts:
- Momentum helps handle noisy sensor data
- It can maintain learning stability despite intermittent data
- It must be tuned based on device resources and data characteristics

Example combining all three:

    class IoTOptimizer:
        def __init__(self, batch_size=32):
            self.velocity = 0
            self.momentum = 0.9
            self.iteration = 0
            self.batch_size = batch_size  # Mini-batch size
            self.batch_buffer = []

        def update(self, sample):
            # Add to mini-batch
            self.batch_buffer.append(sample)

            if len(self.batch_buffer) < self.batch_size:
                return None  # Batch not full yet: no parameter update

            # Compute gradient for mini-batch
            # (compute_gradient is a placeholder for the model-specific gradient)
            gradient = compute_gradient(self.batch_buffer)

            # Get scheduled learning rate
            lr = self.get_learning_rate()

            # Apply momentum update
            self.velocity = self.momentum * self.velocity - lr * gradient

            # Clear buffer and increment iteration
            self.batch_buffer = []
            self.iteration += 1

            return self.velocity

        def get_learning_rate(self):
            return 0.1 / (1 + 0.1 * self.iteration)

These considerations are crucial for:
- Maintaining learning stability
- Handling noisy IoT data
- Optimizing resource usage
- Achieving better convergence

3. Online Learning Algorithm: Online Random Forests

Adaptation of random forests for streaming data:

Hoeffding Trees
- Also known as Very Fast Decision Trees (VFDT)
- Use the Hoeffding bound to make statistically confident splitting decisions from small samples

    from math import log, sqrt

    # find_leaf and the node methods are placeholders for the tree internals.
    class HoeffdingTree:
        def update(self, x, y):
            node = self.find_leaf(x)
            node.update_statistics(x, y)

            if node.should_split():
                # Get best and second-best attributes
                best_attr = node.get_best_attribute()
                second_best = node.get_second_best_attribute()

                # Calculate Hoeffding bound
                epsilon = hoeffding_bound(1, 0.95, node.n_samples)

                # Split if difference exceeds Hoeffding bound
                if (best_attr.merit - second_best.merit) > epsilon:
                    node.split_on(best_attr)

    def hoeffding_bound(range_value, confidence, n_samples):
        """
        Calculate Hoeffding bound epsilon
        range_value: Range of the random variable (e.g., info gain)
        confidence: Desired confidence level (e.g., 0.95)
        n_samples: Number of observations seen
        """
        R = range_value
        delta = 1.0 - confidence
        return sqrt(R * R * log(1.0 / delta) / (2.0 * n_samples))

Advantages:
- Makes splitting decisions with statistical guarantees
- Memory efficient: does not store all data
- Can process unbounded streams
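A short worked example shows how quickly the Hoeffding bound used above tightens as a leaf sees more samples. The sketch below restates the same formula, but takes delta directly instead of a confidence level; `samples_needed` is a hypothetical helper obtained by solving the bound for n, and the chosen values (R = 1, delta = 0.05) are illustrative.

```python
import math

def hoeffding_bound(range_value, delta, n_samples):
    """epsilon = sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt(range_value ** 2 * math.log(1.0 / delta) / (2.0 * n_samples))

def samples_needed(range_value, delta, epsilon):
    """Smallest n for which the bound is at most epsilon (formula solved for n)."""
    return math.ceil(range_value ** 2 * math.log(1.0 / delta) / (2.0 * epsilon ** 2))

for n in (100, 1000, 10000):
    print(n, round(hoeffding_bound(1.0, 0.05, n), 4))

print(samples_needed(1.0, 0.05, 0.05))
```

With these values the bound is roughly 0.12 after 100 samples and 0.04 after 1000, and about 600 samples are enough to separate two attributes whose merits differ by 0.05. Because epsilon shrinks with the square root of n, halving it takes four times as many samples.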
Extremely Fast Decision Trees (EFDT)
- Evolution of Hoeffding Trees
- More aggressive splitting strategy
- Revisits previous split decisions

    class EFDT(HoeffdingTree):
        def update(self, x, y):
            node = self.find_leaf(x)
            node.update_statistics(x, y)

            # Standard Hoeffding Tree split check
            self.check_for_split(node)

            # Additional EFDT improvements
            if node.is_internal():
                # Re-evaluate split decision
                current_merit = node.calculate_split_merit()
                best_possible_merit = node.get_best_possible_split_merit()

                if best_possible_merit > current_merit:
                    node.reevaluate_split()

Enhancements over Hoeffding Trees:
- Faster adaptation to changes
- Better split timing
- Can replace poor splitting decisions
- More suitable for concept drift

Adaptive Random Forests (ARF)
- Ensemble method combining multiple Hoeffding Trees
- Includes drift detection and adaptation

    # DriftDetector, get_random_feature_subset and get_performance_factor
    # are placeholders for components not shown here.
    class AdaptiveRandomForest:
        def __init__(self, n_trees=10):
            self.trees = [HoeffdingTree() for _ in range(n_trees)]
            self.drift_detectors = [DriftDetector() for _ in range(n_trees)]
            self.weights = [1.0] * n_trees

        def update(self, x, y):
            for i, tree in enumerate(self.trees):
                # Use a random feature subset for this tree
                features = self.get_random_feature_subset(x)

                # Test-then-train: predict before the tree has seen (x, y),
                # so the drift detector monitors errors on unseen data
                prediction = tree.predict(features)
                self.drift_detectors[i].update(prediction, y)
                tree.update(features, y)

                if self.drift_detectors[i].detected_change():
                    # Replace tree if drift detected
                    self.trees[i] = HoeffdingTree()
                    self.weights[i] = 1.0
                else:
                    # Update weight based on performance
                    self.weights[i] *= self.get_performance_factor(prediction, y)

Key features:
- Multiple trees for robustness
- Individual drift detection per tree
- Adaptive weighting scheme
- Random feature subsets

A memory-aware variant for IoT devices:

    class IoTAdaptiveRandomForest(AdaptiveRandomForest):
        def __init__(self, n_trees=10, memory_limit_mb=100):
            super().__init__(n_trees)
            self.memory_limit = memory_limit_mb

        def update(self, x, y):
            # Check memory usage (get_memory_usage is a placeholder)
            if self.get_memory_usage() > self.memory_limit:
                self.reduce_ensemble_size()

            super().update(x, y)

        def reduce_ensemble_size(self, keep=5):
            # Rank tree indices by weight, worst first
            ranked = sorted(range(len(self.trees)), key=lambda i: self.weights[i])

            # Keep the top performing trees, with their detectors and weights
            best = ranked[-keep:]
            self.trees = [self.trees[i] for i in best]
            self.drift_detectors = [self.drift_detectors[i] for i in best]
            self.weights = [self.weights[i] for i in best]

Comparison:

Hoeffding Trees
- Best for: stable environments, limited resources
- Memory usage: lowest
- Adaptation speed: slowest

EFDT
- Best for: dynamic environments
- Memory usage: moderate
- Adaptation speed: fast

Adaptive Random Forests
- Best for: complex, changing environments
- Memory usage: highest
- Adaptation speed: very fast
- Prediction accuracy: highest

4. Online Learning Algorithm: Adaptive Algorithms

Several adaptive algorithms are well suited to online learning in IoT:

a) ADWIN (ADaptive WINdowing)
- Maintains a variable-size window of recent instances
- Automatically detects concept drift
- Adjusts window size based on detected changes

    import math

    class ADWIN:
        """Simplified ADWIN. This version stores every value in the window;
        the original algorithm compresses the window into buckets."""

        def __init__(self, delta=0.002):
            self.delta = delta  # Confidence parameter (smaller = fewer false alarms)
            self.window = []
            self.sum = 0
            self.variance = 0

        def update(self, value):
            # Add new element
            self.window.append(value)
            self.sum += value

            # Check for possible cuts
            self.detect_drift()

        def compute_hoeffding_bound(self, n0, n1):
            # Cut threshold from the ADWIN paper, assuming values scaled to [0, 1]
            m = 1.0 / (1.0 / n0 + 1.0 / n1)
            delta_prime = self.delta / (n0 + n1)
            return math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))

        def detect_drift(self):
            n = len(self.window)
            # Start at 1 so that both sub-windows are non-empty
            for i in range(1, n):
                # Split window into two sub-windows
                w0 = self.window[:i]
                w1 = self.window[i:]

                # Calculate means of sub-windows
                mean0 = sum(w0) / len(w0)
                mean1 = sum(w1) / len(w1)

                # Calculate cut threshold
                cut_threshold = self.compute_hoeffding_bound(len(w0), len(w1))

                if abs(mean0 - mean1) > cut_threshold:
                    # Drift detected: remove the older portion
                    self.window = w1
                    self.sum = sum(w1)
                    break

Key features:
- Dynamic window sizing
- Statistical guarantees for drift detection
- Memory-efficient for IoT devices
- Handles gradual and sudden drift

b) FLORA Family
- Maintains separate windows for different concepts
- Uses instance weighting
- Handles recurring concepts

    # evaluate_concept and handle_concept_change are placeholders.
    class FLORA:
        def __init__(self, num_concepts=3, decay=0.95):
            self.concept_windows = [[] for _ in range(num_concepts)]
            self.weights = [1.0] * num_concepts
            self.current_concept = 0
            self.decay = decay  # Weight decay factor

        def update(self, x, y):
            # Store each instance as a mutable [x, y, weight] list
            # (initial weight 1.0) so the weight can be decayed in place
            self.concept_windows[self.current_concept].append([x, y, 1.0])

            # Update instance weights
            self.adjust_weights()

            # Check for concept change
            if self.detect_concept_change():
                self.handle_concept_change()

        def adjust_weights(self):
            for concept in self.concept_windows:
                for instance in concept:
                    # Decay weights of older instances
                    instance[2] *= self.decay

        def detect_concept_change(self):
            # Compare performance of different concept windows
            current_performance = self.evaluate_concept(self.current_concept)
            return any(
                self.evaluate_concept(i) > current_performance
                for i in range(len(self.concept_windows))
                if i != self.current_concept
            )

Features:
- Handles recurring concepts well
- Maintains concept history
- Weight-based instance importance
- Suitable for seasonal IoT data

c) SAM-kNN (Self-Adjusting Memory kNN)
- Maintains short-term and long-term memory
- Balances current and historical patterns

    # is_consistent_with_ltm, clean_ltm, knn_predict, get_stm_performance,
    # get_ltm_performance and combine_predictions are placeholders.
    class SAMkNN:
        def __init__(self, max_stm_size=1000, max_ltm_size=4000):
            self.stm = []  # Short-term memory
            self.ltm = []  # Long-term memory
            self.max_stm_size = max_stm_size
            self.max_ltm_size = max_ltm_size

        def update(self, x, y):
            # Update short-term memory
            self.stm.append((x, y))
            if len(self.stm) > self.max_stm_size:
                # Transfer to long-term memory
                self.transfer_to_ltm()

        def transfer_to_ltm(self):
            # Get oldest STM instance
            old_instance = self.stm.pop(0)

            # Check if it should be added to LTM
            if self.is_consistent_with_ltm(old_instance):
                self.ltm.append(old_instance)

            # Maintain LTM size
            if len(self.ltm) > self.max_ltm_size:
                self.clean_ltm()

        def predict(self, x):
            # Combine predictions from both memories
            stm_pred = self.knn_predict(x, self.stm)
            ltm_pred = self.knn_predict(x, self.ltm)

            # Weight predictions based on recent performance
            stm_weight = self.get_stm_performance()
            ltm_weight = self.get_ltm_performance()

            return self.combine_predictions(stm_pred, ltm_pred, stm_weight, ltm_weight)

Comparison for IoT Applications:

ADWIN
- Best for: detecting clear concept boundaries
- Memory usage: low to moderate
- Computation: moderate
- Ideal for: resource-constrained IoT devices

FLORA Family
- Best for: seasonal or cyclical patterns
- Memory usage: moderate to high
- Computation: moderate
- Ideal for: IoT systems with recurring patterns

SAM-kNN
- Best for: complex, mixed patterns
- Memory usage: high
- Computation: high
- Ideal for: edge devices with sufficient resources

5. Challenges and Solutions

1. Concept Drift Detection

Concept drift refers to changes in the underlying data patterns or relationships over time. Imagine you are predicting energy consumption in a smart building.

Types of concept drift:

Sudden drift:
Before: energy usage = f(time_of_day, temperature)
SUDDEN CHANGE (e.g., COVID lockdown)
After: energy usage = g(time_of_day, temperature)   // completely different pattern

Gradual drift (example: gradually changing temperature impact):
Jan: energy = 0.7 * temperature + 0.3 * time_of_day
Feb: energy = 0.65 * temperature + 0.35 * time_of_day
Mar: energy = 0.6 * temperature + 0.4 * time_of_day
Recurring drift:
Pattern A → Pattern B → Pattern A (seasonal changes)
Summer: high correlation with temperature
Winter: high correlation with daylight hours

Incremental drift (small continuous changes):
Day 1: y = f1(x)
Day 2: y = f1.1(x)
Day 3: y = f1.2(x)

Why it matters in IoT:
- Sensor degradation changes data patterns
- Environmental changes affect relationships
- User behavior evolves over time
- Seasonal variations affect system performance

Detection method:

    # statistical_test and compute_error are placeholders; the default
    # thresholds are illustrative values, not recommendations.
    def detect_drift(recent_data, reference_data, threshold=0.05, error_threshold=0.2):
        # Statistical test: do the two samples come from the same distribution?
        p_value = statistical_test(recent_data, reference_data)

        # Performance monitoring
        current_error = compute_error(recent_data)

        return p_value < threshold or current_error > error_threshold

2. Model Updating Strategies

a) Sliding Window
- Keep a fixed-size window of recent data (a window is the subset of data points kept in memory, typically the most recent ones)
- Update the model using only the data in the window
- Suitable for gradual drift

Example of a sliding window:
Fixed-size window (size = 5): [A B C D E] → new F comes in → [B C D E F]
Time-based window (1 hour):
10:00 - [data points...]
11:00 - Points from 10:00 get removed

b) Weighted Samples
- Assign higher weights to recent samples
- Gradually decrease weights of older samples
- Balanced adaptation to changes

c) Ensemble Approaches
- Maintain multiple models
- Update/replace models based on performance
- Combine predictions for robustness

    from collections import deque

    class FixedWindow:
        def __init__(self, size=1000):
            self.size = size
            # This is the "window": a deque with maxlen drops the oldest
            # point automatically once the window is full
            self.data = deque(maxlen=size)

        def add_data(self, new_point):
            # Add new data point
            self.data.append(new_point)

    # Usage example for sensor data
    # (get_sensor_data is a placeholder for the device's sensor-reading function)
    window = FixedWindow(size=100)
    while True:
        new_sensor_reading = get_sensor_data()
        window.add_data(new_sensor_reading)
        # Window always contains the latest 100 readings (fewer until it fills)

3. Balancing Adaptation and Stability

Key considerations:
- Learning Rate Adjustment: adapt the learning rate based on performance
- Validation Windows: use separate windows for validation
- Model Versioning: maintain backup versions of well-performing models
- Hybrid Approaches: combine multiple adaptation strategies

Implementation example:

    # The helper methods (evaluate_performance, adjust_learning_rate,
    # should_update, is_improvement, save_checkpoint) are placeholders.
    class AdaptiveModel:
        def update(self, x, y):
            # Compute current performance
            performance = self.evaluate_performance()

            # Adjust learning rate
            self.learning_rate = self.adjust_learning_rate(performance)

            # Update model
            if self.should_update(performance):
                self.model.update(x, y, self.learning_rate)

            # Save checkpoint if performance improves
            if self.is_improvement(performance):
                self.save_checkpoint()

Conclusion

Online learning is essential for IoT applications due to:
- Continuous data streams
- Resource constraints
- Dynamic environments
- Real-time adaptation requirements

Key success factors:
- Proper algorithm selection
- Effective drift detection
- Balanced adaptation strategies
- Robust implementation practices

References for Further Reading
1. Gama, J., et al. "A Survey on Concept Drift Adaptation"
2. Losing, V., et al. "Interactive Online Learning"
3. Bifet, A., et al. "Machine Learning for Data Streams"

IV - Edge Computing for Machine Learning in IoT

1. Edge Computing Fundamentals

1. Definition and Importance

Edge computing moves computation and data storage closer to the data sources (IoT devices), offering several advantages:

Traditional cloud architecture:
Device → Internet → Cloud → Processing → Response
(High latency, bandwidth costs, privacy concerns)

Edge computing architecture:
Device → Edge Device → Processing → Response
(Low latency, reduced bandwidth, better privacy)

[Figure: The continuum between cloud intelligence and fully on-device intelligence]

Key benefits:
- Reduced latency
- Bandwidth optimization
- Enhanced privacy
- Improved reliability
- Real-time processing capability

2. Edge vs. Cloud Computing

Comparison matrix:

Aspect             Edge Computing    Cloud Computing
Latency            Low               High
Bandwidth Usage    Low               High
Processing Power   Limited           Extensive
Storage            Limited           Virtually unlimited
Privacy Control    High              Moderate
Maintenance        Distributed       Centralized
Cost Structure     Device-based      Usage-based

2. Benefits of Edge AI: BLERP

The benefits of edge AI can be summarized by the acronym BLERP. It is useful as a filter to help decide whether edge AI is well suited to a particular application. It stands for five words:

Bandwidth: IoT devices often capture more data than they have bandwidth to transmit. In many cases there is not enough bandwidth or energy budget available to send a constant stream of data to the cloud, so most of the sensor data has to be discarded. This is where edge AI is useful.

Latency: Transmitting data takes time. Even if you have a lot of available bandwidth, it can take tens or hundreds of milliseconds for a round trip from a device to an internet server.

Economics: Connectivity costs a lot of money.
Connected products are more expensive to use, and the infrastructure they rely on costs their manufacturers money. The more bandwidth required, the steeper the cost. Things get especially bad for devices deployed in remote locations that require long-range connectivity via satellite.

Reliability: A system that depends on a connection to the cloud stops working when that connection drops. A device that processes data locally can keep operating without connectivity.

Privacy: The usual theory is that if we want our technology products to be smarter and more helpful, we have to give up our data. Edge AI offers an alternative: data can be processed on the device, and only the results need to leave it.

Here is how the BioTrac Band (a wearable band for firefighters, one of Time magazine's 100 best inventions of 2021) fits the BLERP model:

- Bandwidth: Connectivity is limited in the extreme environments where firefighters work.
- Latency: Health issues are time-critical and must be identified immediately.
- Economics: Streaming raw data from sensors would require expensive high-bandwidth connections.
- Reliability: The device can continue to warn firefighters of potential risks even if connectivity drops, and it can function for a long time on a small battery.
- Privacy: Raw biosignal data can be kept on-device, with only critical information being transmitted.
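To make the bandwidth argument concrete, the following back-of-the-envelope calculation compares a sensor's raw data rate with the nominal link rates quoted in the Network Bandwidth Restrictions section (LoRaWAN up to 50 kbps, ZigBee 250 kbps, BLE 1 Mbps). The sensor configuration (3 channels, 16-bit samples, 1 kHz) is an assumed example and is not a specification of the BioTrac Band; `raw_data_rate_kbps` is a hypothetical helper.

```python
def raw_data_rate_kbps(channels, bits_per_sample, sample_rate_hz):
    """Raw sensor data rate in kilobits per second."""
    return channels * bits_per_sample * sample_rate_hz / 1000

# Assumed example sensor: 3 channels, 16-bit samples, 1 kHz sampling
sensor_kbps = raw_data_rate_kbps(channels=3, bits_per_sample=16, sample_rate_hz=1000)

# Nominal (best-case) link rates in kbps
links_kbps = {"LoRaWAN (best case)": 50, "ZigBee": 250, "BLE": 1000}
for name, capacity in links_kbps.items():
    share = 100 * sensor_kbps / capacity
    print(f"{name}: raw stream uses {share:.0f}% of nominal capacity")

# Alternative: classify on-device and send one 1-byte result per second
result_kbps = 8 / 1000
print(f"Reduction factor: {sensor_kbps / result_kbps:.0f}x")
```

Under these assumptions a single sensor nearly saturates the best-case LoRaWAN link, while sending only the on-device classification result cuts the traffic by a factor of several thousand. Real throughput is lower than the nominal rates, so the gap is larger in practice.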
The roles that edge AI technologies play within these applications can be grouped into a few high-level categories:
- Keeping track of objects
- Understanding and controlling systems
- Understanding people and living things
- Generating and transforming signals

[Figure: Edge AI use cases for keeping track of objects]
[Figure: Edge AI use cases for understanding and controlling systems]
[Figure: Edge AI use cases involving people]
[Figure: Edge AI use cases involving living things]
[Figure: Edge AI use cases for transforming signals]

3. Implementing ML at the Edge

1. Model Partitioning Strategies

a) Full Edge Deployment

    class EdgeModel:
        """Runs preprocessing and inference entirely on the edge device."""

        def __init__(self, model_path):
            # load_quantized_model and PreprocessingPipeline are placeholders
            # for application-specific code.
            self.model = load_quantized_model(model_path)
            self.preprocessing = PreprocessingPipeline()

        def predict(self, input_data):
            # All processing happens on the edge
            processed_data = self.preprocessing.transform(input_data)
            return self.model.predict(processed_data)

b) Hybrid Deployment

    class HybridModel:
        """Starts processing on the edge and escalates to the cloud when needed."""

        def __init__(self):
            # load_edge_model and CloudService are placeholders for
            # application-specific code.
            self.edge_model = load_edge_model()
            self.cloud_connector = CloudService()

        def needs_cloud_processing(self, edge_result):
            # Placeholder: the escalation policy is application-specific.
            raise NotImplementedError

        def process(self, data):
            # Initial processing on the edge
            edge_result = self.edge_model.initial_process(data)

            if self.needs_cloud_processing(edge_result):
                # Send to the cloud for complex processing
                return self.cloud_connector.process(edge_result)
            # Otherwise complete processing on the edge
            return self.edge_model.complete_process(edge_result)

2. Implementation Considerations

Resource Management:
- Monitor memory usage
- Track CPU/GPU utilization
- Manage power consumption

Failure Handling:
- Implement fallback mechanisms
- Handle network interruptions
- Manage device restarts

Performance Monitoring:
- Track inference time
- Monitor accuracy
- Measure resource usage

V - Machine Learning for IoT: Practical Considerations

1. Model Selection Guidelines

1. Decision Framework for Choosing Models

When selecting machine learning models for IoT applications, we need to weigh several factors:

a) Resource Constraints Assessment

Memory Footprint:
- Calculate available RAM and storage
- Rule of thumb: model size should not exceed 25% of available RAM
- Consider both model weights and runtime memory requirements
- Example: if a device has 256 MB of RAM, aim for models under 64 MB

Processing Power:
- Measure available MIPS (Million Instructions Per Second) or FLOPS (Floating Point Operations Per Second)
- Consider batch vs. real-time processing needs
- Factor in other concurrent processes (other tasks running simultaneously on the device, such as sensor readings)

    def estimate_model_memory(model, input_size, batch_size=1):
        """Estimate memory needs of a PyTorch model, assuming float32 values.

        input_size is the number of elements in one input sample.
        """
        bytes_per_value = 4  # float32
        mb = 1024 * 1024

        # Count model parameters
        total_params = sum(p.numel() for p in model.parameters())

        # Model size in MB
        model_size_mb = total_params * bytes_per_value / mb

        # Runtime memory (rough approximation: input tensor only,
        # intermediate activations are not counted)
        runtime_memory_mb = batch_size * input_size * bytes_per_value / mb

        return {
            'model_size_mb': model_size_mb,
            'runtime_memory_mb': runtime_memory_mb,
            'total_memory_mb': model_size_mb + runtime_memory_mb,
        }

Mathematical Model for Resource Constraints:

Let M be available memory, P be processing power, and E be energy budget. For a model to be viable:

M_model + M_runtime ≤ α × M_available   (where α ≈ 0.25)
P_required ≤ β × P_available            (where β ≈ 0.7)
E_inference × N ≤ E_budget              (N = number of inferences)

b) Latency Requirements Matrix

Total Latency = Preprocessing Time + Inference Time + Postprocessing Time

Requirement Level   Max Latency   Suitable Models
Real-time           < 50ms        Decision Trees, Linear Models
Near real-time      < 500ms       Light CNNs, Random Forests
Batch processing    > 1s          Larger Neural Networks

c) Performance metrics for IoT scenarios:

1. Resource Utilization
- Memory usage over time
- CPU/GPU utilization
- Power consumption (mW/inference)

2. Temporal Metrics
- Inference latency (mean and 95th percentile). The 95th percentile is the latency within which 95% of all operations complete; the remaining 5% take longer.
- Warm-up time
- Time to first prediction

3. System Impact
- Battery drain rate
- Thermal impact
- Network bandwidth usage

Resource Efficiency Score (RES):

RES = α₁ × Memory_efficiency + α₂ × CPU_efficiency + α₃ × Energy_efficiency
where α₁ + α₂ + α₃ = 1 and each efficiency is 1 − usage / limit

IoT-Adjusted Accuracy (IAA):

IAA = Base_accuracy × (1 − β × Resource_penalty)
where Resource_penalty = (Memory_usage / Memory_limit + CPU_usage / CPU_limit) / 2
and β is the penalty factor (typically 0.1-0.3)

    class IoTMetricsCalculator:
        def __init__(self, memory_limit, cpu_limit, energy_limit):
            self.memory_limit = memory_limit
            self.cpu_limit = cpu_limit
            self.energy_limit = energy_limit

        def calculate_res(self, memory_usage, cpu_usage, energy_usage,
                          weights=(0.3, 0.3, 0.4)):
            # weights = (α₁, α₂, α₃) and should sum to 1
            memory_efficiency = 1 - (memory_usage / self.memory_limit)
            cpu_efficiency = 1 - (cpu_usage / self.cpu_limit)
            energy_efficiency = 1 - (energy_usage / self.energy_limit)

            return (weights[0] * memory_efficiency +
                    weights[1] * cpu_efficiency +
                    weights[2] * energy_efficiency)

        def calculate_iaa(self, base_accuracy, memory_usage, cpu_usage, beta=0.2):
            resource_penalty = (memory_usage / self.memory_limit +
                                cpu_usage / self.cpu_limit) / 2
            return base_accuracy * (1 - beta * resource_penalty)

The following script illustrates the 95th percentile on simulated latency data:

    import numpy as np
    import matplotlib.pyplot as plt

    def calculate_p95_manual(values):
        # Step 1: Sort values in ascending order
        sorted_values = sorted(values)

        # Step 2: Calculate the index for the 95th percentile
        # Formula: int(n * P / 100), where P is the percentile (95 here)
        n = len(sorted_values)
        index = int((n * 95) / 100)

        # Step 3: Return the value at that index
        return sorted_values[index]

    # Generate sample latency data
    np.random.seed(42)  # for reproducibility
    latencies = np.concatenate([
        np.random.normal(100, 10, 950),  # Normal operations
        np.random.normal(200, 20, 50)    # Some slower operations
    ])

    # Calculate p95 both ways. np.percentile interpolates linearly between
    # neighbouring values by default, so the two results can differ slightly.
    p95_numpy = np.percentile(latencies, 95)
    p95_manual = calculate_p95_manual(latencies)

    # Visualize the distribution and p95
    plt.figure(figsize=(12, 6))
    plt.hist(latencies, bins=50, density=True, alpha=0.7, color='blue',
             label='Latency Distribution')
    plt.axvline(p95_numpy, color='red', linestyle='dashed', linewidth=2,
                label=f'P95 = {p95_numpy:.2f}ms')
    plt.axvline(np.mean(latencies), color='green', linestyle='dashed', linewidth=2,
                label=f'Mean = {np.mean(latencies):.2f}ms')
    plt.title('Latency Distribution with P95')
    plt.xlabel('Latency (ms)')
    plt.ylabel('Density')
    plt.legend()
    plt.grid(True)
    plt.show()

    # Print detailed statistics
    print(f"Number of samples: {len(latencies)}")
    print(f"Mean latency: {np.mean(latencies):.2f}ms")
    print(f"Median latency: {np.median(latencies):.2f}ms")
    print(f"P95 latency (numpy): {p95_numpy:.2f}ms")
    print(f"P95 latency (manual): {p95_manual:.2f}ms")
    print(f"Max latency: {np.max(latencies):.2f}ms")

    # Show example calculation steps for a smaller sample
    small_sample = sorted([105, 98, 112, 95, 178, 250, 102, 89, 93, 97])
    n_small = len(small_sample)
    print("\nStep by step calculation for small sample:")
    print(f"1. Sorted values: {small_sample}")
    print(f"2. Number of values (n): {n_small}")
    print(f"3. P95 index = int(n * 95 / 100) = int({n_small * 95 / 100:.1f}) = {int(n_small * 95 / 100)}")
    print(f"4. P95 value (manual): {calculate_p95_manual(small_sample)}")
    print(f"5. P95 value (np.percentile, interpolated): {np.percentile(small_sample, 95):.2f}")

2. Implementation Challenges

1. Hardware Selection

Popular Hardware Platforms:

1. Entry Level
- Arduino Nano 33 BLE Sense
- ESP32-CAM
- Raspberry Pi Pico

2. Mid-Range
- Raspberry Pi 4
- Google Coral Dev Board
- NVIDIA Jetson Nano

3. Industrial Grade
- Intel NUC
- Dell Edge Gateway
- Advantech IoT Gateways

    def evaluate_hardware_platform(specs, requirements):
        """Score a platform between 0 and 1 against the requirements.

        specs needs 'mips', 'ram', 'power' and 'cost'; requirements needs
        'mips', 'ram', 'power' and 'budget'. Units must match between the two.
        """
        # Processing power score (capped at 1.0 once the requirement is met)
        mips_score = min(1.0, specs['mips'] / requirements['mips'])

        # Memory score
        memory_score = min(1.0, specs['ram'] / requirements['ram'])

        # Power efficiency score (lower consumption scores higher)
        power_score = min(1.0, requirements['power'] / specs['power'])

        # Cost efficiency score (lower cost scores higher)
        cost_score = min(1.0, requirements['budget'] / specs['cost'])

        # Weighted average
        total_score = (0.4 * mips_score +
                       0.3 * memory_score +
                       0.2 * power_score +
                       0.1 * cost_score)

        return total_score

2. Software Stack Considerations

a) Operating System Selection

1. Real-time Operating Systems (RTOS)
- FreeRTOS
- Zephyr
- Mbed OS

2. Light Linux Distributions
- Raspbian Lite (now Raspberry Pi OS Lite)
- Ubuntu Core
- Yocto Project

b) ML Frameworks

1. Lightweight Options
- TensorFlow Lite
- ONNX Runtime
- Apache TVM

2. Edge-Specific Frameworks
- Edge Impulse
- Azure IoT Edge
- AWS Greengrass

3. Testing and Validation in IoT Environments

a) Testing Strategy

1. Unit Testing
- Model input/output validation
- Resource usage monitoring
- Error handling verification

2. Integration Testing
- End-to-end data flow
- Network connectivity resilience
- Power management behavior

3. Performance Testing
- Load testing under various conditions
- Battery life assessment
- Thermal stress testing

b) Validation Methodology

1. Data Validation
- Input data quality checks
- Sensor calibration verification
- Data preprocessing validation

2. Model Validation
- Cross-device performance testing
- Environmental condition testing
- Long-term drift assessment

    import time

    import numpy as np

    class IoTModelValidator:
        """Long-running stress test for a deployed model.

        measure_accuracy, measure_latency, get_memory_usage and
        get_device_temperature are device-specific hooks that must be
        implemented for the target platform.
        """

        def __init__(self, model, test_data):
            self.model = model
            self.test_data = test_data

        def stress_test(self, duration_hours=24):
            metrics = {
                'accuracy': [],
                'latency': [],
                'memory': [],
                'temperature': []
            }

            start_time = time.time()
            end_time = start_time + (duration_hours * 3600)

            while time.time() < end_time:
                # Run inference
                batch = self.test_data.get_batch()

                # Measure metrics
                metrics['accuracy'].append(self.measure_accuracy(batch))
                metrics['latency'].append(self.measure_latency(batch))
                metrics['memory'].append(self.get_memory_usage())
                metrics['temperature'].append(self.get_device_temperature())

                # Check for degradation
                if self.check_degradation(metrics):
                    return False, metrics

            return True, metrics

        def check_degradation(self, metrics, window=10):
            # Compare the most recent readings against the earliest ones
            accuracy = metrics['accuracy']
            if len(accuracy) < 2 * window:
                return False  # not enough history yet
            baseline_accuracy = np.mean(accuracy[:window])
            recent_accuracy = np.mean(accuracy[-window:])
            # Flag a drop of more than 10% relative to the baseline
            return recent_accuracy < 0.9 * baseline_accuracy
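The stress test above calls a `measure_latency` hook without defining it. The following is a minimal sketch of how such a measurement might look, assuming the model can be invoked through a plain callable. The function name `measure_latency_ms`, its parameters, and the dummy model are hypothetical and only serve the illustration; it uses only the Python standard library.

```python
import statistics
import time


def measure_latency_ms(predict, batch, repeats=20, warmup=3):
    """Time repeated calls to predict(batch) and summarize them in milliseconds."""
    # Warm-up runs are discarded so one-time setup costs do not skew results
    for _ in range(warmup):
        predict(batch)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(batch)
        samples.append((time.perf_counter() - start) * 1000.0)

    samples.sort()
    # Same index rule as the manual p95 calculation: int(n * 95 / 100)
    p95_index = min(len(samples) - 1, int(len(samples) * 95 / 100))
    return {
        "mean_ms": statistics.mean(samples),
        "p95_ms": samples[p95_index],
        "max_ms": samples[-1],
    }


def dummy_predict(batch):
    # Stand-in for a real model's inference call (assumption for the demo)
    return [x * 0.5 for x in batch]


stats = measure_latency_ms(dummy_predict, batch=list(range(1000)))
print(stats)
```

Reporting the 95th percentile alongside the mean matters on constrained devices, where occasional slow inferences (for example during garbage collection or thermal throttling) can violate a real-time budget even when the average looks acceptable.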
