Regular Expressions & Data Cleaning

Choose a study mode

Play Quiz
Study Flashcards
Spaced Repetition
Chat to Lesson

Podcast

Play an AI-generated podcast conversation about this lesson

Questions and Answers

Which of the these is the most accurate description of Moore's Law?

  • Energy consumption of computers halves every year.
  • The number of transistors on integrated circuits doubles approximately every two years. (correct)
  • The cost of computing doubles every year.
  • Integrated circuit speeds halve every two years.

What is a potential limitation to Moore's Law continuing indefinitely?

  • Lack of skilled engineers.
  • Decreasing interest in technological advancements.
  • The speed of light and Planck length. (correct)
  • The increasing cost of silicon.

What does it mean for a set of questions to be 'decidable' in logic?

  • There exists an algorithm that can correctly return 'true' or 'false' for all questions in the set. (correct)
  • The questions are easy to understand by anyone.
  • The questions can be answered with certainty without any calculations.
  • The questions have been debated for a long time.

Which task is considered an 'undecidable' problem, meaning no algorithm can solve it?

<p>Determining whether a program will finish running or run forever. (A)</p> Signup and view all the answers

What are the defining characteristics of an algorithm?

<p>Unambiguous, executable, and terminating. (C)</p> Signup and view all the answers

Which of these is the best description of 'pseudo code'?

<p>Informal, human-readable description of an algorithm. (B)</p> Signup and view all the answers

What are the three basic structures into which most algorithms can be broken down?

<p>Sequences, selections, loops (B)</p> Signup and view all the answers

In the context of algorithms, what is a 'sequence'?

<p>A set of actions executed in a specific order. (B)</p> Signup and view all the answers

What role does 'selection' play in an algorithm?

<p>It chooses which path to take next based on a condition. (B)</p> Signup and view all the answers

What is the purpose of a 'loop' in an algorithm?

<p>To repeat actions until a condition is met. (A)</p> Signup and view all the answers

When evaluating algorithms, what signifies a 'good' algorithm?

<p>It executes correctly, returns correct results, and is guaranteed to terminate. (C)</p> Signup and view all the answers

What is 'time complexity' when referring to an algorithm?

<p>The number of basic steps or operations it takes relative to input size. (D)</p> Signup and view all the answers

What does Big O notation, such as O(N) or O(log(N)), represent?

<p>The worst-case scenario in how an algorithm's execution time grows with input size. (A)</p> Signup and view all the answers

In time complexity, what does it mean if an algorithm is described as O(N)?

<p>Its execution time grows linearly with the input size. (C)</p> Signup and view all the answers

An algorithm has a time complexity of O(log(N)). What does this imply about its performance as the input (N) increases?

<p>The algorithm becomes slightly slower. (A)</p> Signup and view all the answers

What's the relation between algorithms and programs?

<p>An algorithm describes how to solve a problem; a program is the implementation of that algorithm. (C)</p> Signup and view all the answers

Which of the following is an example of a high-level programming language?

<p>Java (C)</p> Signup and view all the answers

Which best describes the process of 'compiling' a program?

<p>Translating high-level code into machine code all at once. (D)</p> Signup and view all the answers

What happens when a program is 'interpreted'?

<p>The code is read and executed line by line. (C)</p> Signup and view all the answers

In the context of algorithm correctness, what is critical to ensure beyond just producing the correct result?

<p>The algorithm must be guaranteed to terminate for all possible inputs. (B)</p> Signup and view all the answers

Which aspect defines 'analog' electronic computing?

<p>Employing circuits where voltage represents a continuous numerical value. (B)</p> Signup and view all the answers

What characterizes 'digital' electronic computing?

<p>Employing circuits where a voltage threshold is either met or not, representing 0 or 1. (C)</p> Signup and view all the answers

What defines 'electronic computation' in the context of computing methods?

<p>Calculations executed using circuits and electrical signals. (D)</p> Signup and view all the answers

Which of the following is considered a form of 'human computation'?

<p>Performing calculations with pencil and paper. (C)</p> Signup and view all the answers

What does it mean for an algorithm to be 'unambiguous'?

<p>No assumptions are required to execute the algorithm and it uses precise instructions. (D)</p> Signup and view all the answers

Which of the following is considered a 'soft' question within the limits of computation?

<p>Can a computer ever be able to feel love? (B)</p> Signup and view all the answers

Which of the following regular expressions could be used to reformat phone numbers from the format 519-661-2111 x87897 to 519-661-2111 ext. 87897?

<p><code>(\d{3})[-](\d{3}-\d{4}) x(\d{5})</code> (D)</p> Signup and view all the answers

Which of the following regular expressions can be used to split a name column in notepad++ into a first and last name column?

<p><code>^([a-zA-Z\-\. ]+), ([a-zA-Z\-\.]+)</code> (C)</p> Signup and view all the answers

Which regular expression removes all titles, such as Mr. and Dr., from full names?

<p><code>^(Mr|Dr\.)</code> (D)</p> Signup and view all the answers

Consider an algorithm that searches for a name in a phone book. In the worst-case scenario, which of the following search methods has a time complexity of O(N), where N is the number of entries?

<p>Linear search (A)</p> Signup and view all the answers

What is the primary advantage of using a binary search algorithm over a linear search algorithm to look up a person's phone number in a phone book?

<p>Binary search has a better time complexity for large datasets. (A)</p> Signup and view all the answers

Which of the following is an example of machine instructions?

<p><code>Get the value stored at location X, add 12 to it, and store it in location Y.</code> (A)</p> Signup and view all the answers

What does a complier do?

<p>Translates high level languagues into machine code (C)</p> Signup and view all the answers

Which is not VBA (Visual Basic for Applications)?

<p>A general use programming language for developing operating systems (B)</p> Signup and view all the answers

Flashcards

Analog electronic computing

Electronic computation using circuits where voltage represents a value.

Digital electronic computing

Electronic computation where voltage thresholds represent 0 or 1.

Moore's Law

The number of transistors on microchips doubles about every two years.

Decidability

In logic, a set of questions is "decidable" if an algorithm can correctly return true or false for all questions.

Signup and view all the flashcards

Algorithm

A step-by-step description of how to solve a problem.

Signup and view all the flashcards

Well-defined algorithm

An algorithm that is unambiguous, executable, and terminating.

Signup and view all the flashcards

Pseudo Code

Informal, high-level description of an algorithm intended for humans.

Signup and view all the flashcards

Simple Structures

Algorithm broken down into sequence, selection, and loop.

Signup and view all the flashcards

Sequence

A series of actions completed in a specific order.

Signup and view all the flashcards

Selection

Asking a question to determine the next path to take.

Signup and view all the flashcards

Loops

Repeating a process until a specific condition is satisfied.

Signup and view all the flashcards

Correctness

The ability of an algorithm to return the correct result for all inputs.

Signup and view all the flashcards

Time complexity

How many basic steps an algorithm takes relative to input size.

Signup and view all the flashcards

Programming

Creating an executable computer program to address a problem.

Signup and view all the flashcards

Programming Languages

Artificial languages for instructing computers.

Signup and view all the flashcards

Algorithm Definition

A general procedure to compute the solution to a problem.

Signup and view all the flashcards

Program Definition

A particular set of instructions in some programming language.

Signup and view all the flashcards

Study Notes

  • Assignment 2 due date is Feb. 24th by 11:59pm.
  • The due date for Assignment 2 has been extended from what is given in the course syllabus.
  • Assignment 2 covers regular expressions and cleaning data.
  • Data will be used as an example next class (week 6).
  • Survey participation is worth 100 participation points and must be done by Friday (Feb. 7th) at 11:59PM.
  • Input values can be in any unit but do not list the units (numbers only).

Notepad++ Examples

  • Substitutions are demonstrated in Notepad++.

Phone Number Reformatting Example

  • Phone numbers can be reformatted to ###-###-#### ext. ##### form.
  • Find: (\d{3})[-](\d{3}-\d{4}) x(\d{5})
  • Replace:\1-\2 ext\. \3

Name Column Split Example

  • Name column can be split into First and Last name columns.
  • Find: ^([a-zA-Z\-\. ]+), ([a-zA-Z\-\. ]+)\t
  • Replace: \2\t\1\t

Title Removal Example

  • Mr. and Dr. titles can be removed from first names.
  • ^(Mr|Dr)\. with a space at the end is used to find the titles.
  • The transformed data can be pasted into Excel.

Assigned Readings

  • Readings for the week include:
    • GCFGlobal: Algorithms Tutorial
    • GCFGlobal: Hardware and software
    • GCFGlobal: Programming Languages
    • GCFGlobal: Sequences, selections, and loops
  • VBA readings for the next few weeks can be found at Home & Learn.

What is Computation?

  • Physical objects are used such as abacus, slide rules, etc.
  • Human computation includes mental or pencil and paper arithmetic.
  • Computers were people in the 1930s – 1940s.
  • Natural computation uses ants, DNA, and molecules.
  • Analog electronic computing circuits represent numerical values with voltage.
  • Circuits provide voltages that represent computed functions like addition or integration.
  • Digital electronic computing uses circuits where a voltage threshold is either met (1) or not (0).
  • Voltages on many lines represent numbers.

Moore's Law

  • Gordon E. Moore noticed in 1965 that integrated circuit speeds doubled about every two years
  • Computer processor speed has since doubled about once every 18 months to two years from the 1950s.
  • The law has become a self-fulfilling prophecy.

Limits of Computation

  • Hard limits of computation theoretically exist due to the:
    • Speed of light
    • Planck length
  • Sets of questions are "decidable" if an algorithm correctly returns true or false for all questions in the set.
  • Undecidable problems exist where is no possible algorithm.
  • Computers cannot solve undecidable problems.
  • Examples of undecidable problems:
    • Determining if an arbitrary program with finite input will finish running.
    • Deciding whether a mathematical expression is zero.
    • Determining if two grammars describe the same language.
    • Determining if a particular kind of equation has a solution.

Algorithms

  • The algorithm must be Unambiguous, Executable, and Terminating.
  • An algorithm is a step-by-step process to solve a problem.
  • Algorithms sequence of steps.
  • Unambiguous algorithms require no "assumptions" to execute.
  • Unambiguous algorithms use precise instructions
  • Executable algorithms can be carried out in practice.
  • Terminating algorithms eventually come to an end, stop, or halt.
  • Pseudocode is an informal high-level and human-readable description, intended for humans, not machines.
  • Most algorithms can use these simple structures: Sequences, Selections, and Loops

Sequences

  • Sequences are series of actions completed in a specific order.

Selections

  • Selections ask a question to figure out the next path, known as “if” statements.

Loops

  • Loops repeat an action until a certain condition is met.

Evaluating Algorithms

  • Evaluate if an algorithm is "good".
  • Compare algorithms that have the same inputs and results.
  • Check if results are the same or are they equivalent.

Phone Book Algorithm Example

  • Use an algorithm to look up a person's phone number in a phone book.
  • Linear Search is an example, where you start on the first page and check each page.
  • Binary Search is another example, where you start halfway through and throw away half that does not contain the name and repeat.

Correctness

  • Check If the algorithm can be executed without error for all cases.
  • Check if the algorithm guaranteed to terminate for all inputs.

Time Complexity

  • Normally measures worst-case scenario of time complexity.
  • It measures many basic steps.

Programming

  • Programming involves converting an algorithm into precise instructions computers can read.
  • Programming is the process of creating an executable computer program.
  • Programming is also is known as coding or implementation.
  • Machine instructions use low-level commands.
  • Assembly code is a human-readable form of machine instructions.

Programming Languages:

  • Artificial languages give instructions people can understand.
  • Examples include JAVA, VBA, Python, C, C++, C#, FORTRAN, COBOL, R, PHP.
  • Programs are written in compiled languages.
  • The programs are translated into machine code using a compiler.
  • Java and C use compiled programming languages.
  • In interpreted languages an interpreter is used, such as with Ruby and Python.
  • Algorithms serve as the general procedure, while programs are the specific instructions.

Studying That Suits You

Use AI to generate personalized quizzes and flashcards to suit your learning preferences.

Quiz Team

Related Documents

More Like This

Mastering Notepad
10 questions

Mastering Notepad

DauntlessEducation avatar
DauntlessEducation
BSIT Discussion Activity 2
36 questions
Notepad Overview and Interface
10 questions
Use Quizgecko on...
Browser
Browser