Excel As Your Database PDF 2007

Document Details

DetachableOklahomaCity

Uploaded by DetachableOklahomaCity

2007

Paul Cornell, Jr.

Tags

excel database data management microsoft excel

Summary

This book teaches users how to effectively use Microsoft Office Excel as a database system to manage and retrieve data. The book provides practical examples using Excel and explains how to perform various database functions. Users will learn about data storage, data retrieval, and data analysis within Excel. The book is aimed at beginner to intermediate level users of Excel.

Full Transcript

CYAN YELLOW MAGENTA BLACK PANTONE 123 CV www.sharexxx.net - free books & ma...

CYAN YELLOW MAGENTA BLACK PANTONE 123 CV www.sharexxx.net - free books & magazines BOOKS FOR PROFESSIONALS BY PROFESSIONALS ® THE EXPERT’S VOICE ® Companion eBook Available Excel As Your Database Excel As Your Database Dear Reader, Covers Excel 2003 and and This book shows you how to use Microsoft Office Excel as an effective database storage and retrieval system. I’ve found that many people use Excel mostly to Excel 2007 Excel perform worksheet functions such as adding, subtracting, finding the average of different sets of numbers, and so forth. But Excel can do much more. For certain types of data, Excel is an ideal data management system that is a less-expensive Author of alternative to larger computing-intensive systems, such as Microsoft Access, designed for large organizations to store sizable amounts of data. Beginning Excel What-If Data Analysis Tools: Getting If you don’t have the time or interest to master advanced data storage and Started with Goal Seek, Data data management techniques, Excel has an easy learning curve. Also, Excel Tables, Scenarios, and Solver provides data analysis features that are missing from many more-expensive A Complete Guide to data management systems. As Your PivotTables: A Visual If you want to spend less time learning fairly powerful data analysis tech- Approach niques, or if you have a limited budget or a limited set of computing resources, Accessing and Analyzing this book shows you how to quickly and confidently use Excel as a robust data Data with Microsoft Excel management system. I really enjoyed writing this book, because for the first time I am able to present in one place most of Excel’s data storage and data management features. Database This book features “Quick Start” and “Try It” sections to help you get going fast with plenty of hands-on practice. I hope you find this book to be a valuable resource as you master skills to most effectively and efficiently use Excel as your database. Paul Cornell, Jr. THE APRESS ROADMAP A Complete Guide to Excel PivotTables PivotTables: Companion eBook Recipe Book A Visual Approach Excel 2007: Excel As Your Database Beyond the Manual See last page for details Beginning Excel What-If on $10 eBook version Data Analysis SOURCE CODE ONLINE www.apress.com Join online discussions: ISBN-13: 978-1-59059-751-4 ISBN-10: 1-59059-751-6 Cornell Paul Cornell, Jr. forums.apress.com 53499 FOR PROFESSIONALS BY PROFESSIONALS ™ US $34.99 Shelve in Excel User level: 9 781590 597514 Beginner–Intermediate this print for content only—size & color not accurate spine = 0.584" 248 page count 7516FM.qxp 1/5/07 3:02 PM Page i Excel As Your Database Paul Cornell, Jr. 7516FM.qxp 1/5/07 3:02 PM Page ii Excel As Your Database Copyright © 2007 by Paul Cornell, Jr. All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the copyright owner and the publisher. ISBN-13 (pbk): 978-1-59059-751-4 ISBN-10 (pbk): 1-59059-751-6 Printed and bound in the United States of America 9 8 7 6 5 4 3 2 1 Trademarked names may appear in this book. Rather than use a trademark symbol with every occurrence of a trademarked name, we use the names only in an editorial fashion and to the benefit of the trademark owner, with no intention of infringement of the trademark. Lead Editor: Jim Sumser Technical Reviewer: Judith M. Myerson Editorial Board: Steve Anglin, Ewan Buckingham, Gary Cornell, Jason Gilmore, Jonathan Gennick, Jonathan Hassell, James Huddleston, Chris Mills, Matthew Moodie, Dominic Shakeshaft, Jim Sumser, Matt Wade Project Manager: Sofia Marchant Copy Edit Manager: Nicole Flores Copy Editor: Jennifer Whipple Assistant Production Director: Kari Brooks-Copony Production Editor: Ellie Fountain Compositor: Lynn L’Heureux Proofreader: Patrick Vincent Indexer: John Collin Cover Designer: Kurt Krames Manufacturing Director: Tom Debolski Distributed to the book trade worldwide by Springer-Verlag New York, Inc., 233 Spring Street, 6th Floor, New York, NY 10013. Phone 1-800-SPRINGER, fax 201-348-4505, e-mail [email protected], or visit http://www.springeronline.com. For information on translations, please contact Apress directly at 2560 Ninth Street, Suite 219, Berkeley, CA 94710. Phone 510-549-5930, fax 510-549-5939, e-mail [email protected], or visit http://www.apress.com. The information in this book is distributed on an “as is” basis, without warranty. Although every precaution has been taken in the preparation of this work, neither the author(s) nor Apress shall have any liability to any person or entity with respect to any loss or damage caused or alleged to be caused directly or indirectly by the information contained in this work. The source code for this book is available to readers at http://www.apress.com in the Source Code/ Download section. 7516FM.qxp 1/5/07 3:02 PM Page iii Contents at a Glance About the Author................................................................. xiii About the Technical Reviewer...................................................... xv Acknowledgments............................................................... xvii INTRODUCTION............................................................... 1 CHAPTER 1 Data Basics.................................................... 9 CHAPTER 2 Define Your Data.............................................. 35 CHAPTER 3 Enter Data.................................................... 47 CHAPTER 4 Find Data..................................................... 95 CHAPTER 5 Connect to Other Databases.................................. 117 CHAPTER 6 Analyze Data................................................. 137 CHAPTER 7 Automate Repetitive Database Tasks......................... 189 INDEX....................................................................... 213 iii 7516FM.qxp 1/5/07 3:02 PM Page iv 7516FM.qxp 1/5/07 3:02 PM Page v Contents About the Author................................................................. xiii About the Technical Reviewer...................................................... xv Acknowledgments............................................................... xvii INTRODUCTION............................................................... 1 Chapter Summaries....................................... 1 Chapter 1: Data Basics................................. 1 Chapter 2: Define Your Data............................. 2 Chapter 3: Enter Data.................................. 2 Chapter 4: Find Data................................... 2 Chapter 5: Connect to Other Databases..................... 2 Chapter 6: Analyze Data................................ 3 Chapter 7: Automate Repetitive Database Tasks.............. 3 Chapter Layout........................................... 4 Reading Recommendations.................................. 4 Text Conventions.......................................... 5 System Requirements...................................... 7 Sample Data............................................. 8 CHAPTER 1 Data Basics..................................................9 1.1 Learn About Flat File Databases........................... 9 Quick Start.......................................... 9 How To............................................ 10 Tip............................................... 10 Try It.............................................. 10 1.2 Learn About Nonrelational Databases...................... 11 Quick Start......................................... 11 How To............................................ 11 Tip............................................... 12 Try It.............................................. 12 v 7516FM.qxp 1/5/07 3:02 PM Page vi vi CONTENTS 1.3 Learn About Relational Databases......................... 13 Quick Start......................................... 14 How To............................................ 14 Tip............................................... 14 Try It.............................................. 14 1.4 Normalize Data....................................... 17 Quick Start......................................... 20 How To............................................ 21 Tip............................................... 21 Try It.............................................. 22 1.5 Learn About Multidimensional Databases................... 25 Quick Start......................................... 26 How To............................................ 27 Tip............................................... 28 Try It.............................................. 28 1.6 Choose the Right Database Product........................ 31 CHAPTER 2 Define Your Data........................................... 35 2.1 Determine Your Goals, Results, or Outcomes................. 35 Quick Start......................................... 35 How To............................................ 35 Try It.............................................. 36 2.2 Determine Requirements for Collecting, Storing, Analyzing, and Maintaining Your Data................................... 36 Quick Start......................................... 37 How To............................................ 37 Try It.............................................. 38 2.3 Design Your Data...................................... 38 Quick Start......................................... 38 How To............................................ 38 Try It.............................................. 46 CHAPTER 3 Enter Data................................................... 47 3.1 Copy and Move Data................................... 47 Quick Start......................................... 47 How To............................................ 47 Try It.............................................. 49 7516FM.qxp 1/5/07 3:02 PM Page vii CONTENTS vii 3.2 Fill Data............................................. 50 Quick Start......................................... 50 How To............................................ 51 Try It.............................................. 52 3.3 Enter Data with a Data Form............................. 53 Quick Start......................................... 53 How To............................................ 54 Try It.............................................. 56 3.4 Define, Create, or Apply a Name.......................... 57 Quick Start......................................... 57 How To............................................ 58 Tip............................................... 60 Try It.............................................. 60 3.5 Format Data......................................... 62 Quick Start......................................... 62 How To............................................ 63 Tip............................................... 68 Try It.............................................. 68 3.6 Conditionally Format Data............................... 71 Quick Start......................................... 71 How To............................................ 72 Try It.............................................. 74 3.7 Protect Data......................................... 75 Quick Start......................................... 75 How To............................................ 76 Try It.............................................. 81 3.8 Insert a Formula or Function............................. 83 Quick Start......................................... 83 How To............................................ 83 Try It.............................................. 85 3.9 Validate Data......................................... 85 Quick Start......................................... 86 How To............................................ 86 Try It.............................................. 88 3.10 Import Data......................................... 91 Quick Start......................................... 91 How To............................................ 91 Tip............................................... 92 Try It.............................................. 92 7516FM.qxp 1/5/07 3:02 PM Page viii viii CONTENTS CHAPTER 4 Find Data.................................................... 95 4.1 Use Cell References.................................... 95 Tip............................................... 96 Quick Start......................................... 96 How To............................................ 96 Try It.............................................. 97 4.2 Find, Replace, or Go To Data............................. 98 Quick Start......................................... 98 How To............................................ 98 Try It............................................. 102 4.3 Use the OFFSET Worksheet Function...................... 104 Try It............................................. 105 4.4 Use the LOOKUP, HLOOKUP, VLOOKUP, INDEX, and MATCH Worksheet Functions......................... 105 The LOOKUP Function................................ 106 The HLOOKUP Function............................... 107 The VLOOKUP Function............................... 107 The INDEX Function.................................. 108 The MATCH Function................................. 109 Tip............................................... 110 Try It............................................. 111 4.5 Use the Lookup Wizard................................ 113 Quick Start........................................ 114 How To........................................... 114 Try It............................................. 115 CHAPTER 5 Connect to Other Databases.............................. 117 5.1 Create a Reusable Connection to External Data.............. 117 Quick Start........................................ 118 How To........................................... 118 Try It............................................. 119 5.2 Adjust External Data While Importing...................... 120 Quick Start........................................ 121 How To........................................... 121 Try It............................................. 122 5.3 Connect to Excel Data in Other Workbooks.................. 124 Quick Start........................................ 124 How To........................................... 124 Try It............................................. 125 7516FM.qxp 1/5/07 3:02 PM Page ix CONTENTS ix 5.4 Connect to Microsoft Office Access Data................... 125 Quick Start........................................ 126 How To........................................... 126 Try It............................................. 127 5.5 Connect to Microsoft SQL Server Data..................... 127 Quick Start........................................ 127 How To........................................... 127 Try It............................................. 129 5.6 Connect to OLAP Data in Microsoft SQL Server Analysis Services..................................... 131 Quick Start........................................ 131 How To........................................... 132 Try It............................................. 132 CHAPTER 6 Analyze Data............................................... 137 6.1 Sort Data........................................... 137 Quick Start........................................ 137 How To........................................... 137 Try It............................................. 139 6.2 Filter Data with AutoFilter............................... 140 Quick Start........................................ 140 How To........................................... 141 Try It............................................. 143 6.3 Filter Data with Advanced Criteria........................ 143 Quick Start........................................ 144 How To........................................... 144 Try It............................................. 146 6.4 Filter for Unique Data.................................. 147 How To........................................... 148 Try It............................................. 148 6.5 Subtotal Data........................................ 149 Quick Start........................................ 149 How To........................................... 149 Try It............................................. 150 6.6 Create a Data Table................................... 151 Quick Start........................................ 151 How To........................................... 152 Try It............................................. 153 7516FM.qxp 1/5/07 3:02 PM Page x x CONTENTS 6.7 Consolidate Data..................................... 154 Quick Start........................................ 154 How To........................................... 154 Try It............................................. 155 6.8 Group and Outline Data................................ 156 Quick Start........................................ 156 How To........................................... 157 Try It............................................. 158 6.9 Create a Table/List.................................... 159 Quick Start........................................ 159 How To........................................... 159 Try It............................................. 161 6.10 Create a Scenario................................... 162 Quick Start........................................ 163 How To........................................... 163 Try It............................................. 164 6.11 Perform What-If Data Analysis with Goal Seek.............. 166 Quick Start........................................ 166 How To........................................... 166 Try It............................................. 167 6.12 Perform What-If Data Analysis with Solver................. 167 Quick Start........................................ 168 Tip............................................... 168 How To........................................... 169 Try It............................................. 174 6.13 Create a PivotTable and PivotChart...................... 175 Quick Start........................................ 178 How To........................................... 179 Try It............................................. 181 6.14 Change the View of a PivotTable and PivotChart............. 183 Quick Start........................................ 184 How To........................................... 185 Try It............................................. 186 6.15 Perform Statistical Data Analysis........................ 187 Quick Start........................................ 187 Tip............................................... 187 How To........................................... 187 Try It............................................. 188 7516FM.qxp 1/5/07 3:02 PM Page xi CONTENTS xi CHAPTER 7 Automate Repetitive Database Tasks.................... 189 7.1 Use the Macro Recorder............................... 189 Quick Start........................................ 189 How To........................................... 190 Try It............................................. 192 7.2 Understand Excel Visual Basic for Applications............... 192 Quick Start........................................ 193 How To........................................... 193 Try It............................................. 195 7.3 Understand the Excel Programming Model.................. 196 Quick Start........................................ 196 How To........................................... 197 Try It............................................. 199 7.4 Automate Sorting Data................................. 199 Quick Start........................................ 199 How To........................................... 199 Try It............................................. 200 7.5 Automate Filtering Data................................ 200 Quick Start........................................ 200 How To........................................... 200 Try It............................................. 201 7.6 Automate Subtotaling Data............................. 201 Quick Start........................................ 201 How To........................................... 201 Try It............................................. 202 7.7 Automate Calculating a Worksheet Function................ 202 Quick Start........................................ 202 How To........................................... 202 Try It............................................. 202 7.8 Automate Offsets..................................... 203 Quick Start........................................ 203 How To........................................... 203 Try It............................................. 204 7.9 Automate HLOOKUP and VLOOKUP........................ 204 Quick Start........................................ 204 How To........................................... 204 Try It............................................. 206 7516FM.qxp 1/5/07 3:02 PM Page xii xii CONTENTS 7.10 Automate Creating a PivotTable and PivotChart............. 206 Quick Start........................................ 206 How To........................................... 206 Try It............................................. 208 7.11 Automate Changing the View of a PivotTable and PivotChart... 208 Quick Start........................................ 208 How To........................................... 208 Try It............................................. 209 7.12 Automate Connecting to External Data.................... 209 Quick Start........................................ 210 How To........................................... 210 Tip............................................... 211 Try It............................................. 211 INDEX....................................................................... 213 7516FM.qxp 1/5/07 3:02 PM Page xiii About the Author PAUL CORNELL, JR. has been involved with helping folks get the most out of Microsoft Office Excel for more than seven years. Paul has written two previous books about Excel for Apress and one book about Excel for Microsoft Press. He has also helped Microsoft pro- duce online documentation, written many technical articles, served as a web columnist, and blogged about the Visual Basic Language Reference for Office as well as Microsoft Visual Studio Tools for the Microsoft Office System. In his current role at Microsoft, Paul serves as a documentation manager on the Microsoft Visual Studio User Education team. He lives with his wife and two daughters among the mountains of the Pacific Northwestern United States. xiii 7516FM.qxp 1/5/07 3:02 PM Page xiv 7516FM.qxp 1/5/07 3:02 PM Page xv About the Technical Reviewer JUDITH M. MYERSON is a systems architect and engineer. Her areas of interest include middleware technologies, enterprise-wide systems, database technologies, application development, web development, software engineering, network management, servers, virtualized infrastucture, security management, information assurance, standards, RFID (radio frequency identification) technologies, and project management. Judith holds a Master of Science degree in engineering, and is a member of the Institute of Electrical and Electronics Engineers (IEEE) and ISA organiza- tions. She has reviewed/edited a number of books including Hardening Linux; Creating Client Extranets with SharePoint 2003; Microsoft SharePoint: Building Office 2003 Solutions; and Microsoft Operations Manager 2005 Field Guide. xv 7516FM.qxp 1/5/07 3:02 PM Page xvi 7516FM.qxp 1/5/07 3:02 PM Page xvii Acknowledgments I first want to thank my beautiful wife and best friend, Shelley, for close to 20 years now of constant love. I could not have completed this book without her behind-the-scenes support, understanding, counsel, and encouragement. Shell, you are awesome. A big thanks also to my two wonderful daughters, Zoe and Bailey, for their sacrifice of time so that Daddy could work on his books. Girls, you are the best kids that a dad could have. I want to acknowledge my parents, Paul and Darlean, for their ongoing love and support. I appreciate everyone at Apress who contributed to helping me produce this book, espe- cially Gary Cornell, Jim Sumser, Sofia Marchant, Jennifer Whipple, Ellie Fountain, and Tina Nielsen. Also, many thanks to my technical reviewer, Judith Myerson. I am grateful to you, the many readers who have shown such a great interest in my books over the past several years. Thank you for your continued support and feedback as I try to make each book better for you. Finally, I give the ultimate thanks to God. He is the source of my skills, my talents, and my gifts. I’m also thankful for the gift of abundant and eternal life given to me by God through his son, Jesus Christ. I don’t deserve what I’ve been given, but I’m immensely grateful for having received it, and I continually seek opportunities to share it with others. xvii 7516FM.qxp 1/5/07 3:02 PM Page xviii 7516Intro.qxp 1/5/07 3:06 PM Page 1 Introduction N aturally, Microsoft Office Excel is designed to work well with facts and figures. However, Excel can do much more than just crunch numbers. For certain types of data, Excel is an ideal database management system. Excel is very good for entering, storing, and analyzing small amounts of data. Excel is also, of course, a less expensive alternative to larger comput- ing-intensive database management systems designed for business and academic institutions to store sizable amounts of data. For those who don’t have the time or interest to study advanced data storage and data management techniques, Excel provides a much lower learning curve. Also, Excel provides data analysis features lacking in many more expensive database management systems. If you have relatively small amounts of data, want to spend a minimum amount of time learning fairly powerful data analysis techniques, or have a limited computing budget or resources, you can use Excel as your database management system, and you can use this book to help you more quickly learn Excel database management and data analysis techniques that you can put to use right away. Understanding how this book is organized and presented will help you find and learn these techniques faster. Chapter Summaries This book begins by introducing you to data basics and then moves on to help you define your data. You then learn how to enter, find, connect to, and analyze data. You also learn how to automate common data management and data tasks. Chapter 1: Data Basics Chapter 1 introduces you to the basic characteristics of various types of databases, including flat file databases, nonrelational databases, relational databases, and multidimensional data- bases. Being aware of these differences will help you better understand when you can use Excel as your database management system. You will learn how to normalize your data for easier data storage and retrieval. You will also be introduced briefly to other Microsoft database manage- ment systems such as Microsoft Office Access and Microsoft SQL Server. Knowing about these products can provide you with alternatives in case your database needs are greater than what Excel provides. 1 7516Intro.qxp 1/5/07 3:06 PM Page 2 2 INTRODUCTION Chapter 2: Define Your Data Chapter 2 provides you with strategies for determining the goals, results, or outcomes for your data. Understanding how to use these strategies will help you in turn to better determine your requirements for gathering, entering, storing, using, and analyzing your data. Once these requirements are understood, you can more efficiently design your data for best use with Excel. Chapter 3: Enter Data Chapter 3 instructs you in the basics of putting your data into Excel. This chapter covers data entry techniques such as the following: Copying and pasting data into worksheet cells. Filling repetitive or sequential data across worksheet rows or down worksheet columns. Entering data with a data form instead of directly into worksheet cells. Defining, creating, and applying named ranges to worksheet cells for easier, less error- prone data management. Formatting data and copying it across worksheet cells for more intuitive data visualiza- tion and analysis. Conditionally formatting data for even more intuitive, informative data analysis. Protecting data from intentional or inadvertent changes. Inserting functions and formulas into worksheet cells to summarize data from other cells. Validating data to help ensure that only the correct data is entered into worksheet cells. Importing data from other data sources to reduce data-entry errors. Chapter 4: Find Data Chapter 4 instructs you in techniques to locate your data. Techniques include using Excel’s Find, Replace, Go To, offset, Lookup Wizard, HLOOKUP, VLOOKUP, and query functions. Chapter 5: Connect to Other Databases Chapter 5 shows you how to use Excel to work with data in other electronic files and database management systems without actually bringing the data into Excel itself. Files and database management systems include text files, other Excel files, Microsoft Office Access, Microsoft SQL Server, Microsoft SQL Server Analysis Services, and other assorted files and database management systems. 7516Intro.qxp 1/5/07 3:06 PM Page 3 INTRODUCTION 3 Chapter 6: Analyze Data Chapter 6 provides techniques to help you gain insights and make more informed decisions based on your data. Covered data analysis techniques include the following: Sorting data. Filtering data. Subtotaling data. Creating and using data tables. Consolidating data. Grouping and outlining data. Creating and using lists. Creating and using scenarios. Using Goal Seek. Using Solver. Creating and using PivotTables and PivotCharts. Performing statistical data analysis. Chapter 7: Automate Repetitive Database Tasks Chapter 7 describes techniques for writing code that instructs Excel to repeat data entry, data analysis, and data interoperability tasks in order to improve your productivity. This chapter teaches you how to use the macro recorder to make Excel write code for you. You are also introduced to the Excel programming model and the Visual Basic code editor. Understanding this model helps you write more efficient code to make Excel do what you need. Automated tasks covered include the following: Sorting data for faster data analysis. Filtering data to show only the data you want displayed. Subtotaling data. Calculating worksheet functions such as data averages and highest and lowest data values. Using offsets and the HLOOKUP and VLOOKUP worksheet functions to locate related data in nearby worksheet cells. Creating PivotTables and PivotCharts for quicker, more robust data analysis. Changing PivotTable and PivotChart views for even more enhanced data analysis. Performing more advanced statistical data analysis. Connecting to data in other electronic files and database management systems. 7516Intro.qxp 1/5/07 3:06 PM Page 4 4 INTRODUCTION Chapter Layout With only a few minor exceptions, the sections in this book’s chapters are organized similarly to help you more quickly find the specific information that you’re looking for: The “Quick Start” portion of each section provides a summarized process, a set of keystrokes, or a set of mouse actions to more quickly perform the technique without additional information. The “How To” portion of each section expands on the information provided in the “Quick Start,” providing additional details and notes. The “Try It” portion of each section gives you an opportunity to practice the technique, using sample data where applicable. Reading Recommendations This book was written with several groups of people in mind. Based on your specific needs, the following recommendations can help guide you to the chapters that you might be more interested in: If you are a database novice, you should focus on reading Chapters 1 and 2 first. If you feel that you are fairly proficient with database basics, you can safely skip Chapters 1 and 2. If you are a home user, you are probably very interested in getting directly into learning the most important data entry and data analysis techniques. You are probably less likely to worry about perfectly designing your data, writing computer programming code, or connecting to data in other database management systems. If these interests apply to you, you should focus on reading Chapters 3, 4, and 6. If you are a business professional, but you are neither an information technology (IT) professional nor a computer programmer, you are probably interested additionally in learning about designing your data to reflect your workgroup’s data needs. You may also need to occasionally connect to data in other workgroups. As your data grows, you may need to consider working with your IT department to step up to a more expensive, more resource intensive database management system. If these situations apply to you, you should focus on reading Chapters 3 through 6. If you are an IT professional, you are most likely interested additionally in using Excel to interoperate with other more expensive, more powerful database management sys- tems. If this is the case, you should focus on reading Chapter 5. If you are a computer solution developer, you are likely very interested in writing code to make it easier for your end users to perform repetitive tasks with Excel. If this inter- ests you, you should focus on Chapter 7. 7516Intro.qxp 1/5/07 3:06 PM Page 5 INTRODUCTION 5 Text Conventions Although in many cases it is faster and easier to work in Excel by using keyboard shortcuts or right-clicking shortcuts instead of clicking menus, this book’s procedures are presented from the perspective of menus whenever possible. This is done to keep instructions brief, consistent, and predictable. Tip In Excel 2007, you can display keyboard shortcut combinations by pressing the Alt key and then pressing keys corresponding to the key letters that appear next to the menus and commands. For example, to insert a blank worksheet, press Alt, H, I, S, as shown in Figure 1. Figure 1. Pressing the Alt key in Excel 2007 as a shortcut to invoke menu commands To keep instructions brief, in Microsoft Excel 2003, menu commands are designated with the click verb and are separated by the right arrow character (➤). For example, this book substi- tutes the phrase “on the File menu, click Open” with the phrase “click File ➤ Open,” as shown in Figure 2. Figure 2. Using the phrase “click File ➤ Open” for Excel 2003 as a substitute for the phrase “on the File menu, click Open” In contrast, when you click tabs (the equivalent of menus) in Microsoft Excel 2007, groups of commands appear in a “ribbon” instead of submenus. For example, when you click the Office Button (a circular button at the top-left corner of Excel with the Office System logo icon inside of it), a submenu still appears, as shown in Figure 3. So in this case, the phrase “click Office Button ➤ Open” is still used. 7516Intro.qxp 1/5/07 3:06 PM Page 6 6 INTRODUCTION Figure 3. Clicking the Office Button in Excel 2007 is equivalent to clicking the File menu in Excel 2003. However, clicking any tab in the row of tabs directly below and next to the Office Button displays a ribbon with several commands organized by groups. For example, clicking the Home tab in the row of tabs directly below and next to the Office Button displays a Clipboard group that contains a Paste command. When you click the Paste command in the Clipboard group, a submenu appears containing commands such as Paste Special. So this book substitutes the phrase “click Home, and in the Clipboard group click Paste, and then click Paste Special” with the phrase “click Home ➤ (Clipboard) Paste ➤ Paste Special.” In this case, the word Clipboard is surrounded by parentheses as a visual indicator of where the Paste command is located, but you don’t actually click the Clipboard group, as shown in Figure 4. Figure 4. Clicking the Home ➤ (Clipboard) Paste ➤ Paste Special command in Excel 2007 7516Intro.qxp 1/5/07 3:06 PM Page 7 INTRODUCTION 7 Tip Many commands in Excel 2007 ribbon groups have an icon but no text. As you begin using Excel 2007, you may have to spend a few moments resting your mouse pointer on several of these icons to see their command names appear. For example, in the Home ribbon’s Number group, the Format Cells: Number command is represented in the lower right corner by an icon with a down arrow. When you rest your mouse pointer on that icon, as shown in Figure 5, the Format Cells: Number command screen tip appears. When you click that screen tip, the Format Cells dialog box appears with its Number tab selected. As you continue using Excel 2007, you will find these commands much faster. Figure 5. Clicking the Home (Number) ➤ Format Cells: Number down arrow opens the Format Cells dialog box. System Requirements This book was written based on the features and commands included with Microsoft Office Excel 2007 and Microsoft Office Excel 2003. For Excel 2007 system requirements, see http://office.microsoft.com/en-us/suites/ HA101668651033.aspx. For Excel 2003 system requirements, see http://www.microsoft.com/office/excel/ prodinfo/sysreq.mspx. To practice some of the techniques in Chapters 1, 5, and 7, you will need access to a com- puter with Microsoft Office Access 2007 or Microsoft Office Access 2003 and Microsoft SQL Server 2005 Express Edition or greater installed. To practice techniques related to online analyt- ical processing, you will need access to a computer with Microsoft SQL Server 2005 Standard Edition or greater installed. For system requirements, visit the following web pages: For Access 2007: http://office.microsoft.com/en-us/suites/HA101668651033.aspx For Access 2003: http://www.microsoft.com/office/access/prodinfo/sysreq.mspx For SQL Server 2005: http://www.microsoft.com/sql/prodinfo/sysreqs/default.mspx Note Unless stated otherwise, the information in this book pertains to both the 2007 and 2003 versions of Excel and Access. 7516Intro.qxp 1/5/07 3:06 PM Page 8 8 INTRODUCTION Sample Data This book provides supplementary sample data to help you complete the “Try It” exercises provided throughout this book. Electronic files containing the sample data—and supporting files where needed—are available at the Apress web site’s Source Code/Download page at http://www.apress.com/book/download.html. To help you locate the correct files for a specific “Try It” exercise within the download, the files generally follow the naming convention of ExcelDB_ChXX_YY, where XX is the chapter number, and YY is a single section number or a range of section numbers. For example, a sample file name corresponding to the first section in Chapter 6 (Section 6.1) would start with ExcelDB_Ch06_01. Likewise, a sample file name corresponding to the first three sections of Chapter 5 (Sections 5.1, 5.2, and 5.3) would start with ExcelDB_Ch05_01–03. This book’s sample Excel data is presented in Excel 97–Excel 2003 format with the file extension.xls. These files should be able to be opened in any version of Excel from Microsoft Excel 97 through Excel 2007. (These files should be able to be opened in earlier Excel versions as well, but this cannot be guaranteed.) This book’s sample Microsoft Access data is presented in Access 2003 format with the file extension.mdb. These files should be able to be opened in Access versions 2002, 2003, and 2007. These files are not guaranteed to open in earlier Access versions. This book’s sample Microsoft SQL Server data should only be able to be attached to Microsoft SQL Server 2005 instances. This data is not guaranteed to be able to be attached to earlier SQL Server versions. 7516Ch01.qxp 1/5/07 3:05 PM Page 9 CHAPTER 1 Data Basics D ata comes in many different forms. Whether the data is a personal contact history, a set of academic test scores, a catalog of products and prices, a group of scientific research facts, or a multinational corporation’s general ledger entries for the past 20 years, data can be small or large, simple or complex, and summarized or detailed. Understanding the differences between common database types—flat file databases, nonrelational databases, relational databases, and multidimensional databases—will help you decide whether to use Microsoft Office Excel, Microsoft Office Access, Microsoft SQL Server, or a similar database management system from another computer software manufacturer to enter, store, modify, and analyze your particular data. 1.1 Learn About Flat File Databases A flat file database is a single electronic text file containing a list of data records with one record per line, usually with a newline character separating each data record. Each record contains one or more data fields with each field separated by a character, known as a delimiter, such as a comma or a tab character. For example, in a list of personal contacts, each data record contains an individual contact’s information: the contact’s name, address, and phone number are each a data field. Flat file databases are ideal for storing simple data values, especially when those values are in data records with varying numbers of fields. However, flat file databases can be tough to enter data into; specifically, they are error-prone when entering multiple data field delimiters. Flat file database data records and data fields usually are consistent in their definition, layout, and data format, such as the personal contact list described earlier, but this is not strictly required. For example, in a flat file database containing a list of students and their test scores, the first data record could contain a student’s name and five numeric test score data fields, while the second data record could contain a student’s identification number and seven alphabetic test score data fields. Quick Start A flat file database can most easily be represented as an electronic text file with each data record separated usually by a newline character. For each data record, each data field in that data record is separated by a common character such as a comma or a tab character. 9 7516Ch01.qxp 1/5/07 3:05 PM Page 10 10 CHAPTER 1 DATA BASICS How To To quickly create a flat file database, use one of two ways. The first is the following: 1. Start Microsoft Notepad. 2. Type a series of data records with each data field value separated by a common character such as a comma or a tab character. 3. Press Enter after each data record. 4. Save the file. The other way is the following: 1. Start Excel. 2. Type a series of data records with each data field in a subsequent worksheet cell. 3. Enter each data record on a subsequent worksheet row. 4. Save the file. Tip You should only use flat file databases for the simplest lists of data values. Flat file databases are prone to corruption, especially when two or more users or computer programs are trying to work with the same flat file database at the same time. Flat file databases are also prone to data entry errors. If you miss entering just one delimiter in a flat file database, you increase the probability of a database management system to not be able to correctly open, display, ana- lyze, or store the data values. Try It In this exercise, you will open a flat file database in Notepad. Then you will open the same flat file database in Excel to see how Excel presents flat file data in rows and columns on a worksheet: 1. Start Microsoft Notepad. 2. Click File ➤ Open. 3. Browse to and select the ExcelDB_Ch01_01.txt file, and click Open. Notice that each data field is separated by a comma, and each data record is on a separate line. 4. Start Excel. 5. Click Office Button ➤ Open (for Excel 2007) or click File ➤ Open (for Excel 2003). In the Files of Type box, select All Files. 6. Browse to and select the ExcelDB_Ch01_01.txt file, and click Open. The Text Import Wizard appears. 7. Select the Delimited option, and then click Next. 7516Ch01.qxp 1/5/07 3:05 PM Page 11 CHAPTER 1 DATA BASICS 11 8. Clear the Tab check box, select the Comma check box, and click Finish. Notice that each data field is in a separate worksheet cell, and each data record is on its own row. 9. Quit Excel, and quit Notepad. 1.2 Learn About Nonrelational Databases The defining characteristics of a nonrelational database are that each data table (which is a collection of individual data records) in a nonrelational database is self-describing and self- contained. For example, in a nonrelational database containing a personal contact list, the contact list itself is a single data table; each contact is a data record; each contact’s first name is a data field; and each contact’s street address is another data field. Furthermore, the data field values are straightforward to understand, and the contact list does not depend on any other data tables to convey each contact’s information. Nonrelational databases are great for storing lists of data values with the following: The same number of data fields in each data record. Data values and data records that do not depend on other data tables to convey all of the information about each data record. Data values that are straightforward to understand. Data fields that are organized with similar data values grouped together. There are two key differences between flat file databases and nonrelational databases. The first key difference is that a flat file database does not need to have the same number of data fields per data record. Nonrelational databases always have the same number of data fields per data record. The second key difference between flat file databases and nonrelational databases is that flat file databases do not need to contain data field names. Nonrelational databases always contain data field names. Quick Start A nonrelational database is simply an electronic file containing the same number of data fields in each data record, and each data field has a name. Similar to a flat file database, you could represent a nonrelational database as a text file containing a set of data records, with each data record separated usually by a newline character. Each data field in a data record is separated by a common character such as a comma or a tab character. Each data record con- tains the same number of data fields. How To To quickly create a nonrelational database, use one of two ways. One way is the following: 1. Start Notepad. 2. Type a series of data field names, with each data field name separated by a common character such as a comma or a tab character, and press Enter. 7516Ch01.qxp 1/5/07 3:05 PM Page 12 12 CHAPTER 1 DATA BASICS 3. Type a series of data records with each data field value separated by a common charac- ter such as a comma or a tab character. Make sure that each data record has the same number of data field values as data field names. 4. Press Enter after each data record. 5. Save the file. The other way is the following: 1. Start Excel. 2. In the first row of a worksheet, type a series of data field names, with each data field name in a subsequent worksheet cell. 3. In the second and subsequent rows, type a series of data field values with a data field value or a null value for each data field name. 4. Enter each data record on a subsequent worksheet row. 5. Save the file. Tip A data field in a nonrelational database that contains no data value for a given data record is commonly known as a null value or a null field. Null values are commonly expressed as a blank value, the value Null, or the value N/A (for not applicable). Note that the value zero (0) is never used to convey a null value. For most data entry, storage, and analysis tasks, Excel handles flat file databases and non- relational databases the same. Try It In this exercise, you will open a nonrelational database in Notepad. Then you will open the same nonrelational database in Excel to see how Excel presents the data in rows and columns on a worksheet: 1. Start Notepad. 2. Click File ➤ Open. 3. Browse to and select the ExcelDB_Ch01_02.txt file, and click Open. Notice that the first line contains data field names; each data field is separated by a comma; each data record is on a separate line; and there are the same number of data field values for each data record. 4. Start Excel. 5. Click Office Button ➤ Open (for Excel 2007) or click File ➤ Open (for Excel 2003). In the Files of Type box, select All Files. 6. Browse to and select the ExcelDB_Ch01_02.txt file, and click Open. The Text Import Wizard appears. 7516Ch01.qxp 1/5/07 3:05 PM Page 13 CHAPTER 1 DATA BASICS 13 7. Select the Delimited option, and click Next. 8. Clear the Tab check box, select the Comma check box, and click Finish. Notice that each data field is in a separate worksheet cell; each data record is on its own row; and there are the same number of data field values for each row. Tip To see all of the data field names and data field values, click the Select All button (the blank button in the upper left corner of the worksheet), and click Home ➤ (Cells) Format ➤ AutoFit Column Width (for Excel 2007) or Format ➤ Column ➤ AutoFit Selection (for Excel 2003). 9. Quit Excel, and quit Notepad. 1.3 Learn About Relational Databases Similar to nonrelational databases discussed in the previous section, relational databases store data records in two or more data tables. However, relational databases are different than nonrelational databases in one key aspect: the data tables rely on each other to capture all of the facts and figures in the database. For example, in a nonrelational database containing cus- tomer sales history, one data table contains all of the customers’ names and addresses and all of the sales transactions for all of the customers. In contrast, in a relational database contain- ing customer sales history, one data table would contain the customers’ names and addresses, while another data table would contain all of the sales transactions for all of the customers. You should consider using relational databases for all but the simplest of data lists. Very large flat file and nonrelational databases can be slow to open, tough to search in for specific data records, and prone to data-entry errors and data corruption. There are two main benefits to using relational databases vs. nonrelational databases. The first benefit of using relational databases is the efficient use of database space. Using the example of the nonrelational database in the preceding section, there would be a lot of repeated customer names and addresses and therefore increased wasted space. The second benefit of using relational databases is the reduction of data-entry errors. Duplicating data can increase the probability of data-entry errors every time you retype the same customer names and addresses. Once you remove the repeated customer names and addresses to a sep- arate data table in a relational database, you can update the customer names and addresses in just one table. To declare relationships among data tables and cross-reference related data records in separate data tables to each other in a relational database, you use primary keys and foreign keys. A primary key is a data field containing a unique identifier—such as a sequential num- ber, a part number, a customer ID, or a Social Security number—applied to each data record in the main table, also known as the primary-key data table. A foreign key then is a data field in the related table, also known as the foreign-key data table, containing the unique identifier from the related data record in the primary-key data table. For example, in the relational data- base example in the preceding section, you could assign each customer in the customer data table a unique ID number, and include the customer’s unique ID number in each data record in the sales transactions data table for that customer. 7516Ch01.qxp 1/5/07 3:05 PM Page 14 14 CHAPTER 1 DATA BASICS Quick Start To create a relational database, create two or more data tables, and then enter data records into each data table. Make sure that each data table contains a primary-key data field and that each data record in that data table contains a unique identifier in the primary-key data field. Also, for each related data table, create a foreign-key data field, and make sure that each data record in the related data table contains a primary-key data value from the related record in the primary-key data table. How To To create a relational database in Excel, do the following: 1. Start Excel. 2. Using one worksheet per data table, enter data records into each table. 3. Make sure that each worksheet contains a primary-key data field. 4. Make sure that for each worksheet, each data record in that worksheet has a primary- key data value in the primary-key data field that is unique to that worksheet. 5. Make sure that for each worksheet with data records related to the primary-key data table worksheet, the related worksheet contains a foreign-key field. 6. Make sure that each data record in the related worksheet contains a primary-key data value in the foreign-key data field, with that primary-key data value taken from the related record in the primary-key data table worksheet. 7. Save the file. Tip Foreign-key data tables should always also contain a primary-key data field. For example, a customer data table could have a related sales transactions data table, which in turn could have a related sales products data table. In this case, the sales transactions data table would need a foreign-key data field to cross-reference unique customers to sales transactions, and the sales transactions data table would also need a primary-key data field to relate unique sales transactions to unique sales products. (Of course, the customer data table would also need a primary-key data field to uniquely identify each customer, and the sales products data table would also need a primary-key data field to uniquely identify each sales product.) Try It In this exercise, you will examine a relational database in Excel. You will then use Access to import the relational data, examine the data in Access, define data table relationships, and examine related data: 1. Start Excel. 2. Click Office Button ➤ Open (for Excel 2007) or click File ➤ Open (for Excel 2003). 7516Ch01.qxp 1/5/07 3:05 PM Page 15 CHAPTER 1 DATA BASICS 15 3. Browse to and select the ExcelDB_Ch01_03.xls file, and click Open. Notice that there are five worksheets in this workbook, one worksheet each for the Orders, Line Items, Suppliers, Products, and Salespeople data tables. In each worksheet, the primary key field ends in “PK,” and any foreign key fields end in “FK.” 4. Close the workbook. Now, import the workbook data into Access. For Access 2007, do the following: 1. Start Access. 2. Click Office Button ➤ New. 3. In the Blank Database pane, in the File Name box, type any name that’s easy for you to remember for the database, click the Browse for a Location to Put Your Database icon and select a location for the database, and then click Create. Note You may need to scroll down the screen to find the Create button if the Create button is not visible under the File Name box. 4. Click External Data ➤ (Import) Excel. 5. Click Browse, browse to and select the ExcelDB_Ch01_03.xls file, click Open, and click OK. 6. Click the Show Worksheets option, select Orders in the list of available worksheets, and then click Next. 7. Select the First Row Contains Column Headings check box, and then click Next. 8. In the Indexed list, select Yes (No Duplicates), and then click Next. 9. Select the Choose My Own Primary Key option, select Order_ID_PK, and then click Next. 10. Click Finish, and then click Close. The Orders table is imported into the Access data- base. 11. Repeat steps 4 through 10 to import the Line Items, Suppliers, Products, and Salespeo- ple worksheets into the Access database. Be sure to substitute in step 9 the values Line_ID_PK, Supplier_ID_PK, Product_ID_PK, and Salesperson_ID_PK for Order_ ID_PK as appropriate. You can check your results against the imported worksheets in the finished ExcelDB_Ch01_03.mdb database file. 12. Open each of the tables in Access to ensure that the data in the Orders, Line Items, Suppliers, Products, and Salespeople data tables match the data in the Excel work- book. You can check your results against the imported worksheets in the finished ExcelDB_Ch01_03.mdb database file if needed. 7516Ch01.qxp 1/5/07 3:05 PM Page 16 16 CHAPTER 1 DATA BASICS For Access 2003, do the following: 1. Start Access. 2. Click File ➤ New. 3. In the New File task pane, click Blank Database, type any name that’s easy for you to remember for the database in the File Name box, browse to a location to put your database, and then click Create. 4. Click File ➤ Get External Data ➤ Import. 5. In the Files of Type list, select Microsoft Excel. 6. Browse to and select the ExcelDB_Ch01_03.xls file, and click Import. 7. Select the Show Worksheets option, select Orders in the list of available worksheets, and then click Next. 8. With the First Row Contains Column Headings check box selected, click Next. 9. With the In a New Table option selected, click Next. 10. In the Indexed list, select Yes (No Duplicates), and click Next. 11. Select the Choose My Own Primary Key option, select Order_ID_PK, and click Next. 12. Click Finish, and click OK. The Orders table is imported into the Access database. 13. Repeat steps 4 through 12 to import the Line Items, Suppliers, Products, and Salespeo- ple worksheets into the Access database. Be sure to substitute in step 11 the values Line_ID_PK, Supplier_ID_PK, Product_ID_PK, and Salesperson_ID_PK for Order_ ID_PK as appropriate. You can check your results against the imported worksheets in the finished ExcelDB_Ch01_03.mdb database file. 14. Open each of the tables in Access to ensure that the data in the Orders, Line Items, Suppliers, Products, and Salespeople data tables match the data in the Excel work- book. You can check your results against the imported worksheets in the finished ExcelDB_Ch01_03.mdb database file if needed. Next, create relationships among the data tables in Access: 1. For Access 2007, click Database Tools ➤ (Show/Hide) Relationships. For Access 2003, click Tools ➤ Relationships. 2. On the Show Table dialog box’s Tables tab, with the Line Items data table selected, click Add. Repeat this step for the Orders, Products, Salespeople, and Suppliers data tables. Then click Close. 3. In the Orders data table, drag the Order_ID_PK data field to the Line Items data table’s Order_ID_FK data field. 7516Ch01.qxp 1/5/07 3:05 PM Page 17 CHAPTER 1 DATA BASICS 17 Note Be sure to close all of the open data tables in Access before you complete the preceding step. 4. In the Edit Relationships dialog box, select the Enforce Referential Integrity check box, and then click Create. Note Selecting the Enforce Referential Integrity check box ensures that Access will prevent you from deleting a data record in the primary data table when there are matching data records in a related data table. This prevents you from having “stranded” or “orphaned” data in related data tables. 5. Repeat steps 3 and 4 for the following data fields: In the Products data table, drag the Product_ID_PK data field to the Line Items data table’s Product_ID_FK data field. In the Salespeople data table, drag the Salesperson_ID_PK data field to the Orders data table’s Salesperson_ID_FK data field. In the Suppliers data table, drag the Supplier_ID_PK data field to the Products data table’s Supplier_ID_FK data field. You can check your results against the finished ExcelDB_Ch01_03.mdb database file. 6. Click Office Button ➤ Save (for Excel 2007) or File ➤ Save (for Excel 2003). 7. Close the Relationships window. Now that you have data table relationships defined, drill down into one of the supplier’s sales order details in Access. 1. Open the Suppliers data table. 2. Click the plus sign symbol next to the Acme data row. 3. Click the plus sign symbols next to the two products that are displayed to discover how many units were ordered on which orders. 4. Quit Access, and quit Excel. 1.4 Normalize Data Relational databases work best when data is normalized. When you normalize your data, you eliminate redundant data to help protect your data against data entry errors. You also ensure that the information in each data table is correctly linked so that you can properly cross- reference related data. 7516Ch01.qxp 1/5/07 3:05 PM Page 18 18 CHAPTER 1 DATA BASICS You normalize data when you have a lot of repetitive data in one or more data tables and you want to restructure the data to reduce data entry errors and possibly reduce data storage requirements. To normalize data, you should follow a set of well-established rules called normal forms. There are three common normal forms. There are also several less common normal forms that are beyond the scope of this book. The general strategies underlying the three common normal forms are the following: Eliminate repeating data in rows or data records. Eliminate repeating data in columns or data fields, moving the repeated data to other data tables. Use primary keys and foreign keys to cross-reference related data records among data tables. For example, examine the following nonnormalized data in Table 1-1. Table 1-1. Nonnormalized Weather Data for Three United States Cities City, State Date 1 High Low Air Date 2 High Low Air Quality Quality Portland, 15-Feb 47 30 Moderate 16-Feb 45 26 Moderate Oregon Portland, 17-Feb 33 23 Good 18-Feb 39 27 Good Oregon Salem, 15-Feb 47 27 Moderate 16-Feb 44 23 Moderate Oregon Salem, 17-Feb 31 22 Good 18-Feb 39 23 Good Oregon Spokane, 15-Feb 35 18 Good 16-Feb 23 2 Good Washington Spokane, 17-Feb 20 10 Good 18-Feb 32 14 Good Washington Notice the following facts in the preceding data table: The cities and states are contained in the same data field, with several duplicate cities and states listed. The date, high temperature, low temperature, and air quality data fields are presented in a peculiar manner: the weather for four dates is presented in more than four data records; and three city and state combinations are presented in more than three records. Many air quality data field values are repeated. 7516Ch01.qxp 1/5/07 3:05 PM Page 19 CHAPTER 1 DATA BASICS 19 By moving repeating data to other data tables and linking the data tables together through primary keys and foreign keys, you could present the data in Tables 1-2 through 1-7. Table 1-2. Cities Data Table for Normalized Weather Data from Table 1-1 City_ID_PK City 1 Portland 2 Salem 3 Spokane Table 1-3. States Data Table for Normalized Weather Data from Table 1-1 State_ID_PK State 1 Oregon 2 Washington Table 1-4. Cities States Data Table for Normalized Weather Data from Table 1-1 City_State_ID_PK City_ID_FK State_ID_FK 1 1 1 2 2 1 3 3 2 Table 1-5. Dates Data Table for Normalized Weather Data from Table 1-1 Date_ID_PK Date 1 15-Feb 2 16-Feb 3 17-Feb 4 18-Feb Table 1-6. Air Qualities Data Table for Normalized Weather Data from Table 1-1 Air_Quality_ID_PK Air_Quality 1 Moderate 2 Good 7516Ch01.qxp 1/8/07 8:48 PM Page 20 20 CHAPTER 1 DATA BASICS Table 1-7. Weather Data Data Table for Normalized Weather Data from Table 1-1 Data_Record_ID_PK Date_ID_FK City_State_ID_FK High Low Air_Quality_ID_FK 1 1 1 47 30 1 2 2 1 45 26 1 3 3 1 33 23 2 4 4 1 39 27 2 5 1 2 47 27 1 6 2 2 44 23 1 7 3 2 31 22 2 8 4 2 39 23 2 9 1 3 35 18 2 10 2 3 23 2 2 11 3 3 20 10 2 12 4 3 32 14 2 Normalizing this data results in the following benefits: The Cities, States, and Cities States data tables are extendable to allow a city with the same name to exist in multiple states. If a city or state changes its name, you only need to change a record in the Cities or States data table. If the representation of a date needs to change (for example, changing 15-Feb to 02/15 or 15/02), you only need to change data records in the Dates data table. If the air quality categories change, you only need to change data records in the Air Qualities data table. As an added side benefit, sorting and averaging weather data is a bit more straightforward in the normalized Weather Data data table. In the nonnormalized data for example, averaging high temperatures for cities in Oregon for February 15 is more complicated: first you must filter for all rows where Oregon is somewhere in the City, State data field, then you must somehow collect all of the High data field values together where the corresponding Date 1 or Date 2 data field is 15-Feb (which is tough for many database management systems to do automatically), then you calculate the average high temperature. In the normalized Weather Data data table, you filter for all rows where the Date_ID_FK data value contains a matching data value in the Dates data table corresponding to 15-Feb and where the City_State_ID_FK data value contains a matching value in the Cities States and States data tables corresponding to Oregon; then you average the values in the High data field. Quick Start To normalize repetitive data, you eliminate the repeating data in data records and data fields, moving the repeating data to other data tables. You then use primary keys and foreign keys to cross-reference related data records among those data tables. 7516Ch01.qxp 1/5/07 3:05 PM Page 21 CHAPTER 1 DATA BASICS 21 How To To normalize data in one or more existing data tables, do the following: 1. Identify data fields with repeating data values or multipart data values (for example, contact name and address data values or product name and manufacturer data values contained in the same data field). Break these data values into multiple data fields (for example, separate data fields for name, address, product name, or manufacturer data values). 2. Group data fields with related data values into separate data tables (for example, a data table for contacts, a data table for products, or a data table for manufacturers). 3. Eliminate repeating data values in each data table (for example, a repeated address or a repeated product name). 4. Assign a primary key data field to each data table and a unique identifier for each data record in that data table (for example, a unique contact identification number or a unique product part number). 5. Add foreign key data fields as needed to cross-reference related data records contained in multiple data tables (for example, foreign key data fields describing the relationships between products and manufacturers, cross-referencing primary key data values in the separate product and manufacturer data tables). 6. Create additional data tables and use foreign keys as needed to store data records con- taining unique facts and figures (for example, a product sales transaction data table containing individual sales transaction details, cross-referencing primary key data val- ues in the product/manufacturer data table). Tip A one-to-many relationship between two data tables is the most common type of relationship. A one-to-many relationship exists when a data record in data table A can have many matching data records in another data table B, but a data record in data table B has only one matching data record in data table A. For example, a sales order in one data table can have many match- ing sales line items in another data table, but each sales line item matches only one sales order. A less frequent but still common type of relationship, a many-to-many relationship, exists between two data tables when a data record in table A can have many matching records in data table B, and a record in data table B can have many matching data records in data table A. A many-to-many relationship is made possible by creating a third table, called a junction table, that contains foreign keys from both data tables A and B. A many-to-many relationship is really two one-to-many relationships described by a third data table. For example, a sales

Use Quizgecko on...
Browser
Browser