10 Working with Files in Python.pptx
Uploaded by PlentifulMonkey
Universidad Autónoma de Nuevo León
Full Transcript
Working with Files in Python
Overview of file handling in Python

Text file characteristics
  - Extension: .txt
  - Contains only plain text
  - Pure sequence of character codes
Using the print function
  - Displays data to the screen
Storing data on disk
  - Share data with other programs or colleagues
Using text files
  - Store strings in a text file
  - Open the text file in another notebook

Using the open function
  - Returns a file object
  - Commonly used with two arguments

File Modes
  Reading Mode ('r')
    - Default mode for opening files
    - Opens a file for reading
  Writing Mode ('w')
    - Opens a file for writing
    - Creates a new file if it does not exist
  Append Mode ('a')
    - Opens a file in append mode
  Binary Mode ('b')
  Reading and Writing Mode ('r+')
  Writing and Reading Mode ('w+')
  Reading, Writing, and Appending Mode ('a+')

Writing to a File
  Opening a File Object
    - File name: "test.txt"
    - Mode: "w" for writing
  Writing to the File
    - Five lines written
    - Newline character (\n) at the end of each line
  Closing the File Object
    - Ensures all data is saved

Appending a File
  Changing the Mode to Append
    - Switch the mode to 'a' instead of 'w'
    - Allows adding content to the end of the file
  Writing to the File
    - Similar to writing a file
    - Use the 'write' method to append content
  Results in Figure 11.2
    - Shows the appended line at the end of the file

Steps to read a file
  Read a file from disk
    - Use a file handling method
  Store contents to a variable
    - Assign file contents to a variable named 'content'
  Example file: test.txt
    - Read the created test.txt file

Reading file contents into a variable
  Store lines in a one-string variable
    - Verify variable content to prove it's a string
  Read file contents line by line
    - Use f.readlines() to store lines in a list

Saving arrays to a text file using NumPy
  np.savetxt Function
    - First argument: file name
    - Second argument: object to save
    - Third argument: format for output (e.g., "%.2f" for two decimals)
    - Fourth argument: header for the file
  Working with Numerical Methods
    - Often involves numbers or arrays
    - Use methods to save/read arrays to/from files
    - NumPy package is commonly used for this purpose

Reading arrays from a text file using NumPy
  Using np.loadtxt function
    - Reads file directly into an array
    - Skips the first header
  Arguments for file reading
    - Control how a file is read
    - Check documentation for more details

CSV Files

Introduction to CSV files
  CSV File Format
    - Delimited text file using commas to separate values
    - Stores large tables of data in plain text
    - Each line represents one data record
    - Fields in each record are separated by commas
  Visualization
    - Can be opened using Microsoft Excel
    - Visualizes rows and columns
  Python CSV Module
    - Used to read and write CSV files
  NumPy Package

Writing to a CSV file
  1. Generating Random Data
    - Used np.random function for 100 rows and 5 columns
  2. Saving Data to CSV
    - Used np.savetxt function
    - Set delimiter argument to ","
  3. Opening CSV File
    - Can be opened using Microsoft Excel
    - Values separated by commas

Reading a CSV file
  Reading CSV File
    - Use np.loadtxt function to read CSV file
    - Specify delimiter to indicate data separation by commas
  Saving and Outputting Data
    - Save CSV file to disk
    - Output first 5 rows of the CSV file
  Opening CSV File
    - Open CSV file using Microsoft Excel
    - Open CSV file using a text editor

Beyond NumPy
  NumPy for CSV Files
    - Convenient for handling CSV files
  Pandas Package
    - Popular for dealing with tabular data in a DataFrame
  Exploring Other Packages
    - openpyxl

Pickle for Data Storage
  - Alternative to text or CSV files
  - Stores dictionaries, tuples, lists, etc.
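
As a minimal sketch of using pickle in place of a text or CSV file, the snippet below stores a dictionary that mixes a list and a tuple and then loads it back. It uses only the standard-library `pickle.dump` and `pickle.load` with binary file modes. The file name `data.pkl` and the dictionary contents are hypothetical, chosen only for illustration.

```python
import os
import pickle
import tempfile

# Hypothetical data: a dictionary holding a string, a list, and a tuple.
record = {
    "name": "station_1",
    "samples": [0.1, 0.2, 0.3],
    "location": (37.0, -122.0),
}

# Work in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data.pkl")  # hypothetical file name

    # Serialize: 'wb' opens the file for writing in binary mode.
    with open(path, "wb") as f:
        pickle.dump(record, f)

    # Deserialize: 'rb' opens the file for reading in binary mode.
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The loaded object is a new object with the same contents and types.
print(restored == record)
print(type(restored["location"]))
```

The `with` statement closes each file automatically, which plays the same role as an explicit call to `close()`.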
Introduction to Pickle files
  Serialization Process
    - Converts objects in memory to byte streams
    - Stores as binary files on disk
  Deserialization Process
    - Loads binary files back to Python objects

Writing to a Pickle file
  pickle.dump function
    - Takes two arguments: the object and a file object
    - File object is returned by the open function
  Mode of open function
    - Mode is 'wb'
    - Indicates writing to a binary file

Reading a Pickle file
  Loading a Pickle File
    - Use pickle.load function to load the file
    - Mode of open function is 'rb' (read binary)
    - Deserializes the binary file back to the original object
  Advantages of Pickle Format
    - Easy to store and load Python data structures
    - No extra code needed to change the data structure

Introduction to JSON files
  - JSON stands for JavaScript Object Notation
  - File extension is ".json"
  - Language-independent data format
  Attractive to use
    - Takes up less space on disk
    - Faster manipulation compared to pickle
  - Good practice to store data using JSON
  - Explore handling JSON files in Python

JSON format
  JSON Representation
    - Uses quoted strings for text
    - Contains values in key-value pairs
  Structure Similarity
    - Nearly identical to Python dictionary

Writing a JSON file
  01 Using json Library
    - json is natively supported by Python
    - Other libraries include simplejson, jyson, etc.
  02 Creating and Saving JSON Files
    - Create a dictionary
    - Save it to a JSON file on the disk
  03 Serializing Objects
    - Use json.dump function
    - First argument: the object
    - Second argument: a file object
    - Open function mode: 'w' for write

Reading a JSON file
  Loading JSON Files
    - Use json.load function to load JSON files
    - JSON files can be saved on disk
  Similarities to Pickle
    - JSON supports strings and numbers
    - Supports nested lists, tuples (written as JSON arrays and loaded back as lists), and objects
    - Similar usage to pickle

Introduction to HDF5 files
  HDF5 for Large Data Storage
    - Powerful binary data format with no file size limit
    - Provides parallel IO and low-level optimizations
  HDF5 File Structure
    - Contains datasets and groups
    - Datasets are array-like collections of data
    - Groups are folder-like containers
    - Attributes describe properties of datasets and groups
  Hierarchical Nature
    - Data saved like a file system with folder-like structures
    - Groups operate like dictionaries with keys and values

Using HDF5 in Python

Creating and reading an HDF5 file
  HDF5 Object Creation
    - Created an HDF5 object for writing: station.hdf5
  Data Storage in Groups
    - Stored data into two top-level groups: acc and gps
    - Each top-level group contains subgroups labeled 1 or 2 for station names
    - Each station contains a subgroup for storing array data
  Adding Attributes
    - Attributes added to groups or data: dt, start_time, and location
  Folder-like Structure
    - Data acc_1 saved at /acc/1/data
  File Closure
  Functions for Data Creation

Reading an HDF5 file
  Reading HDF5 with h5py
    - Use h5py to read the HDF5 file into hf_in
  Viewing Groups in HDF5
    - Use keys function to see groups in the HDF5
  Accessing Group Members
    - Get access to group members using hf_in["acc"]
  Specifying Path to Datasets
    - Directly specify path to datasets as hf_in["acc/1/data"]
  Accessing Attributes
    - Attributes associated with data can be accessed as a dictionary
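
The HDF5 steps above can be sketched roughly as follows, assuming the third-party `h5py` and `numpy` packages are installed. The names `station.hdf5`, `acc`, `gps`, `acc_1`, `hf_in`, `dt`, `start_time`, and `location` come from the slides. The array contents and attribute values are hypothetical placeholders, and only station 1 of the `acc` group is filled in to keep the sketch short.

```python
import os
import tempfile

import h5py
import numpy as np

# Placeholder array standing in for real accelerometer data (hypothetical).
acc_1 = np.arange(10, dtype=float)

# Work in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "station.hdf5")

    # Create the file for writing; the context manager closes it on exit.
    with h5py.File(path, "w") as hf:
        acc = hf.create_group("acc")      # top-level group
        hf.create_group("gps")            # second top-level group, left empty here
        station = acc.create_group("1")   # subgroup for station 1
        dset = station.create_dataset("data", data=acc_1)  # stored at /acc/1/data

        # Attributes can be attached to datasets or groups (values are hypothetical).
        dset.attrs["dt"] = 0.04
        dset.attrs["start_time"] = "2019/01/01"
        station.attrs["location"] = "station 1"

    # Reopen the file for reading.
    with h5py.File(path, "r") as hf_in:
        top_level = sorted(hf_in.keys())         # names of the top-level groups
        stations = list(hf_in["acc"].keys())     # members of the acc group
        data = hf_in["acc/1/data"][:]            # read the dataset into a NumPy array
        attrs = dict(hf_in["acc/1/data"].attrs)  # attributes behave like a dictionary

print(top_level, stations, data.shape, sorted(attrs))
```

Slicing with `[:]` copies the dataset into memory before the file is closed; without it, `data` would refer to a dataset in a closed file.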