Intro To Programming In Python 4.1 Performance PDF

INTRO TO PROGRAMMING IN PYTHON S E D G E W I C K ・ W A Y N E ・ D O N D E R O 4.1 Performance Section 4.1 https://introcs.cs.princeton.edu/python...

INTRO TO PROGRAMMING IN PYTHON S E D G E W I C K ・ W A Y N E ・ D O N D E R O 4.1 Performance Section 4.1 https://introcs.cs.princeton.edu/python Slides ported to Python by James Pope Modified by Bill Tucker & Bernd Fischer 2023 Last updated on 2022/11/17 14:25 INTRO TO PROGRAMMING IN PYTHON S E D G E W I C K ・ W A Y N E ・ D O N D E R O 4.1 Performance The challenge Empirical analysis Mathematical models Doubling method Resizing arrays and memory The challenge (since the earliest days of computing machines) “ As soon as an Analytic Engine exists, it will necessarily guide the future course of the science. Whenever any result is sought by its aid, the question will arise—By what course of calculation can these results be arrived at by the machine in the shortest time?” − Charles Babbage Difference Engine #2 Designed by Charles Q. How many times do you Babbage, c. 1848 have to turn the crank? Built by London Science Museum, 1991 3 The challenge (modern version) Q. Will I be able to use my program to solve a large practical problem? Q. And, how might I understand its performance characteristics so as to improve it? Key insight (Knuth 1970s). Use the scientific method to understand performance. 4 Three reasons to study program performance import stdio gambler.py 1. To predict program behavior import sys import stdrandom Will my program finish? def main(): When will my program finish? stake = int(sys.argv) goal = int(sys.argv) trials = int(sys.argv) 2. To compare algorithms and implementations. wins = 0 for t in range(0,trials): Will this change make my program faster? cash = stake while (cash > 0 and cash < goal): How can I make my program faster? if (stdrandom.uniform() < 0.5): cash += 1 else: cash -= 1 3. To develop a basis for understanding the if (cash == goal): wins += 1 stdio.writeln(str(wins) + ' wins of ' + str(trials) ) problem and for designing new algorithms if __name__ == '__main__': main() Enables new technology. An algorithm is a method for Enables new research. solving a problem that is suitable for implementation as a computer program. We study many algorithms later in this degree. An algorithm design success story N-body simulation Goal: Simulate gravitational interactions among N bodies. Brute-force algorithm uses N 2 steps per time unit. Issue (1970s): Too slow to address scientific problems of interest. Andrew Appel Success story: Barnes-Hut algorithm uses N logN steps and enables new research. PU '81 senior thesis N2 time limit on available time NlogN problem size (N ) 6 Another algorithm design success story Discrete Fourier transform Goal: Break down waveform of N samples into periodic components. Applications: digital signal processing, spectroscopy,... Brute-force algorithm uses N 2 steps. John Tukey Issue (1950s): Too slow to address commercial applications of interest. 1915–2000 Success story: FFT algorithm uses NlogN steps and enables new technology. N2 time limit on available time NlogN problem size (N ) 7 Quick aside: binary logarithms lg N Def. The binary logarithm of a number N (written lg N ) 4 3 is the number x satisfying 2x = N. 2 1 or log2N 0 1 2 4 8 16 Q. How many recursive calls for convert(N)? N Frequently encountered values binary.py N approximate value lgN log10N def convert(N): 210 1 thousand 10 3.01 if (N == 1): return '1' else: return convert(N//2) + str(N % 2) 220 1 million 20 6.02 230 1 billion 30 9.03 A. Largest integer less than or equal to lg N (written ⎣lg N⎦). Prove by induction. Fact. The number of bits in the binary representation of N is 1+⎣lg N⎦. Fact. Binary logarithms arise in the study of algorithms based on recursively solving problems half the size (divide-and-conquer algorithms), like convert, FFT and Barnes-Hut. 8 An algorithmic challenge: 3-sum problem Three-sum. Given N integers, enumerate the triples that sum to 0. threesum.py For simplicity, just count them. import stdio import stdarray Applications in computational geometry def countTriples(a): Find collinear points. # See next slide. Does one polygon fit inside another? def main(): Robot motion planning. a = stdarray.readInt1D() [a surprisingly long list] count = countTriples(a) stdio.writeln(count) ✗ % more 6ints.txt 30 -30 0 ✗ 30 -30 -20 -10 40 0 30 -20 -10 % python threesum.py < 6ints.txt -30 -10 40 3 Q. Can we solve this problem for N = 1 million? ✓ 9 i j k a[i] a[j] a[k] Three-sum implementation 0 1 2 30 -30 -20 0 1 3 30 -30 -10 "Brute force" algorithm i 0 1 2 3 4 5 0 1 4 30 -30 40 0 1 5 30 -30 0 Process all possible triples. a[i] 30 -30 -20 -10 40 0 0 2 3 30 -20 -10 Increment counter when sum is 0. 0 2 4 30 -20 40 0 2 5 30 -20 0 threesum.py def countTriples(a): 0 3 4 30 -10 40 n = len(a) Keep i < j < k to avoid 0 3 5 30 -10 0 count = 0 processing 0 4 5 30 40 0 for i in range(n): each triple 6 times 1 2 3 -30 -20 -10 for j in range(i+1, n): for k in range(j+1, n): 1 2 4 -30 -20 40 if (a[i] + a[j] + a[k]) == 0: 1 2 5 -30 -20 0 count += 1 1 3 4 -30 -10 40 triples return count 1 3 5 -30 -10 0 with i < j < k 1 4 5 -30 40 0 2 3 4 -20 -10 40 Q. How much time will this program take for N = 1 million? 2 3 5 -20 -10 0 2 4 5 -20 40 0 3 4 5 -10 40 0 10 INTRO TO PROGRAMMING IN PYTHON S E D G E W I C K ・ W A Y N E ・ D O N D E R O Image sources http://commons.wikimedia.org/wiki/File:Babbages_Analytical_Engine,_1834-1871._(9660574685).jpg http://commons.wikimedia.org/wiki/File:Charles_Babbage_1860.jpg http://commons.wikimedia.org/wiki/File:John_Tukey.jpg http://commons.wikimedia.org/wiki/File:Andrew_Apple_(FloC_2006).jpg http://commons.wikimedia.org/wiki/File:Hubble's_Wide_View_of_'Mystic_Mountain'_in_Infrared.jpg INTRO TO PROGRAMMING IN PYTHON S E D G E W I C K ・ W A Y N E ・ D O N D E R O 4.1 Performance The challenge Empirical analysis Mathematical models Doubling method Resizing arrays and memory A first step in analyzing running time Find representative inputs Option 1: Collect actual input data. Option 2: Write a program to generate representative inputs. Input generator for ThreeSum % python generator.py 1000000 10 import stdio 28773 import stdrandom generator.py -807569 % python generator.py 10 10 import sys -425582 -2 594752 1 600579 -4 def main(): -483784 1 #Generate N integers in [-M, M) -861312 -2 M = int(sys.argv) -690436 -10 N = int(sys.argv) -732636 -4 360294 1 stdio.writeln( N ) 0 for i in range(N): -7 number = stdrandom.uniformInt(-M, M) not much chance of a 3-sum stdio.writeln( number ) good chance of a 3-sum if __name__ == '__main__': main() 13 Empirical analysis Replace writeln() in threesum Run experiments with this code. Measure running time Start with a moderate input size N. Measure and record running time. watch = Stopwatch() count = countTriples(a) Double input size N. time = watch.elapsedTime() stdio.writef( '%d (%.0f seconds)\n', count, time ) Repeat. Tabulate and plot results. Tabulate and plot results 250 Run experiments N time (seconds) 200 250 0 % python generator.py 1000000 250 | python threesum.py time (seconds) 500 4 2 (0 seconds) 1000 31 % python generator.py 1000000 500 | python threesum.py 2000 248 7 (4 seconds) 100 % python generator.py 1000000 1000 | python threesum.py 61 (31 seconds) % python generator.py 1000000 2000 | python threesum.py 0 0 1000 2000 4000 8000 495 (248 seconds) problem size (N ) 14 Aside: experimentation in CS Experimentation is virtually free, particularly by comparison with other sciences. one million experiments one experiment Chemistry Biology Physics Computer Science Bottom line. No excuse for not running experiments to understand costs. 15 Data analysis Curve fitting N TN lgN lg TN 3.1 × 10 −8 × N 3 Plot on log-log scale. 250 0.5 8 −1 0.5 If points are on a straight line (often the case), a 500 4 9 2 4 power law holds—a curve of the form aN b fits. 1000 31 10 5 31 The exponent b is the slope of the line. 2000 248 11 8 248 Solve for a with the data. ✓ log-log plot x-intercept Do the math 8 log time (lg TN) lgTN = lga + 3lgN equation for straight line of slope 3 5 raise 2 to a power of both sides straight line TN = aN 3 of slope 3 substitute values from experiment 2 248 = a × 20003 a = 3.1 × 10 −8 solve for a −1 TN = 3.1 × 10 −8 × N 3 substitute 10 11 12 13 log problem size (lg N ) a curve that fits the data ? 16 Prediction and verification Hypothesis. Running time of ThreeSum is 3.1 × 10 −8 × N 3. Prediction. Running time for N = 4,000 will be 1982 seconds. about half an hour % python generator.py 1000000 4000 | python threesum.py 31903 (1985 seconds) ✓ Q. How much time will this program take for N = 1 million? A. 31 billion seconds (more than 983 years). 17 Another hypothesis 1970s 3.4 × 10 −4 × N 3 seconds N time (seconds) 2000000 250 5313 500 42500 1000 340000 2000 2720000 1000000 (estimated ) 0 0 1000 2000 4000 8000 VAX 11/780 3.1 × 10 −8 × N 3 seconds 250 2010s: 10,000+ times faster N time (seconds) 200 250 0 500 4 1000 31 Macbook Air 100 2000 248 0 0 1000 2000 4000 8000 Hypothesis. Running times on different computers differ by only a constant factor. 18 INTRO TO PROGRAMMING IN PYTHON S E D G E W I C K ・ W A Y N E ・ D O N D E R O Image sources http://commons.wikimedia.org/wiki/File:FEMA_-_2720_-_Photograph_by_FEMA_News_Photo.jpg http://pixabay.com/en/lab-research-chemistry-test-217041/ http://upload.wikimedia.org/wikipedia/commons/2/28/Cut_rat_2.jpg http://pixabay.com/en/view-glass-future-crystal-ball-32381/ INTRO TO PROGRAMMING IN PYTHON S E D G E W I C K ・ W A Y N E ・ D O N D E R O 4.1 Performance The challenge Empirical analysis Mathematical models Doubling method Resizing arrays and memory Mathematical models for running time Q. Can we write down an accurate formula for the running time of a computer program? A. (Prevailing wisdom, 1960s) No, too complicated. A. (D. E. Knuth, 1968–present) Yes! Determine the set of operations. Find the cost of each operation (depends on computer and system software). Find the frequency of execution of each operation (depends on algorithm and inputs). Total running time: sum of cost ´ frequency for all operations. Don Knuth 1974 Turing Award 21 Warmup: 1-sum operation cost frequency function call/return 20 ns 1 def count(a): variable declaration 2 ns 3 N = len(a) cnt = 0 assignment 1 ns 3 i=0 less than compare 1/2 ns N+1 while( i < N ): if (a[i] == 0): equal to compare 1/2 ns N cnt += 1 array access 1/2 ns N i += 1 Note that frequency of increments increment 1/2 ns between N and 2N return cnt depends on input. representative estimates (with some poetic license); knowing exact values may require study and experimentation. Q. Formula for total running time ? TN = (20x1) + (2x3) + (1x3) + (0.5xN) + (0.5) + (0.5xN) + (0.5xN) + (0.5x1.5N) A. cN + 29.5 nanoseconds, where c is between 2 and 2.5, depending on input. 22 Warmup: 2-sum operation cost frequency def count(a): function call/return 20 ns 1 N = len(a) variable declaration 2 ns N+2 cnt = 0 assignment 1 ns N+2 for i in range(N): less than compare 1/2 ns (N + 1) (N + 2)/2 for j in range(i+1, N): if (a[i] + a[j] == 0): equal to compare 1/2 ns N (N − 1)/2 cnt += 1 array access 1/2 ns N (N − 1) return cnt increment 1/2 ns between N (N + 1)/2 and N 2 exact counts tedious to derive #i

Intro To Programming In Python 4.1 Performance PDF

Document Details

Tags

Related

Summary

Full Transcript