Programmer's Python Data - Files and Paths
Written by Mike James   
Monday, 17 February 2025

There are properties that give access to the different parts of a Path and the most useful are:

  • Path.parts

is a tuple of all of the different parts of a path, i.e. it undoes the constructor

  • Path.drive

is the drive letter if there is one

  • Path.root

the root of the path

  • Path.anchor

the drive and root

  • Path.parent

the path with its final component removed, i.e. the immediate parent, much the same as ..

  • Path.parents

a sequence giving each ancestor directory in turn, moving up the path to the root.

  • Path.name

the final path component – e.g. the file name

  • Path.suffix

the file extension

  • Path.suffixes

a list of the file extensions

  • Path.stem

the final path component without its last extension.
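
As a quick illustration of these properties, the sketch below uses the pure path classes so that the results should not depend on the operating system it is run on. The paths themselves are made up for the example.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical paths, used only for illustration
p = PurePosixPath("/home/mike/data/archive.tar.gz")

print(p.parts)     # ('/', 'home', 'mike', 'data', 'archive.tar.gz')
print(p.root)      # /
print(p.parent)    # /home/mike/data
print(p.name)      # archive.tar.gz
print(p.suffix)    # .gz
print(p.suffixes)  # ['.tar', '.gz']
print(p.stem)      # archive.tar

# parents is a sequence of ancestors, nearest first
for ancestor in p.parents:
    print(ancestor)

# A Windows-style path shows the drive, root and anchor
w = PureWindowsPath("C:/Users/mike/notes.txt")
print(w.drive)     # C:
print(w.root)      # \
print(w.anchor)    # C:\
```

Notice that stem only removes the last extension, so a name with two extensions keeps the first.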

There are also methods that will let you manipulate paths and discover the paths of standard directories. Some of the more notable are:

  • Path.cwd()

returns a path object representing the current directory.

  • Path.home()

returns the user’s home directory.

  • Path.exists() 

is true if the path – file or directory – exists.

  • Path.mkdir(mode=0o777, parents=False, exist_ok=False)

creates a new directory at the specified path, where mode is the access permission. If parents=True any missing directories on the path are also created. If exist_ok=True then no exception is raised if the directory already exists.

  • Path.rmdir()

removes the specified directory which must be empty.

  • Path.rename(target)

changes the name of the file or directory specified by Path to target and returns a new path object referencing it.

  • Path.iterdir()

returns an iterator over the contents of the directory.

  • Path.relative_to(path)

returns a new path relative to the path specified. If this cannot be done a ValueError is raised.
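
A minimal sketch of some of these methods working together is given below. It runs inside a temporary directory so that nothing permanent is created. The directory and file names are invented for the example, and touch and unlink, which create and delete a file, are pathlib methods not described above.

```python
import tempfile
from pathlib import Path, PurePosixPath

# Work inside a throwaway directory so nothing permanent is created
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    target = base / "projects" / "demo"        # hypothetical names

    target.mkdir(parents=True)                 # creates projects and demo
    target.mkdir(parents=True, exist_ok=True)  # already exists, no exception
    existed = target.exists()

    old = target / "notes.txt"
    old.touch()                                # create an empty file
    new = old.rename(target / "log.txt")       # returns the new Path (3.8+)

    contents = sorted(child.name for child in target.iterdir())
    relative = new.relative_to(base)

    new.unlink()                               # rmdir needs an empty directory
    target.rmdir()
    removed = not target.exists()

print(existed, contents, relative.parts, removed)

# relative_to raises ValueError when one path is not inside the other
try:
    PurePosixPath("/usr/lib").relative_to("/etc")
except ValueError as err:
    print("not relative:", err)
```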

To change individual components of the path you can use:

Path.with_stem(new)
Path.with_name(new)
Path.with_suffix(new)

Each one returns a new path with the specified part replaced.
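
For example, assuming a made-up path and Python 3.9 or later, which is when with_stem was added, the three methods behave something like this:

```python
from pathlib import PurePosixPath

p = PurePosixPath("/data/report.txt")   # hypothetical path

by_stem = p.with_stem("summary")        # /data/summary.txt
by_name = p.with_name("readme.md")      # /data/readme.md
by_suffix = p.with_suffix(".csv")       # /data/report.csv

print(by_stem, by_name, by_suffix)
print(p)                                # /data/report.txt, unchanged
```

The original path object is not modified, which is why the results have to be assigned to something.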

There are lots of other methods and many of them replace functions in the os module.

In addition the Path object provides a direct open method:

Path.open(mode='r', buffering=-1, encoding=None, 
                               errors=None, newline=None)

which simply calls the built-in open function and is a more object-oriented way of doing the job.
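
A short sketch of the open method in use follows. The file name is hypothetical and the file is created in a temporary directory.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "greeting.txt"    # hypothetical file name

    # Equivalent to open(path, "w", encoding="utf-8")
    f = path.open("w", encoding="utf-8")
    f.write("Hello Path\n")
    f.close()

    f = path.open("r", encoding="utf-8")
    text = f.read()
    f.close()

print(repr(text))   # 'Hello Path\n'
```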

File Modes

Now that we know how to specify a file using a path object, we can return to examine the different modes that we can use to open a file.

The mode can be set to:

Character   Meaning
r           open for reading (default)
w           open for writing, truncating the file first
x           open for exclusive creation, failing if the file already exists
a           open for writing, appending to the end of file if it exists
b           binary mode
t           text mode (default)
+           open for updating (reading and writing)
            r+ positions file at start
            w+ overwrites file if it exists

Some of these are obvious, but there are a few things to notice. The first is that the file that you are trying to open for reading has to exist. If you open a file for writing and it exists then it is essentially deleted and recreated as an empty file ready for you to write to. If you simply want to add to the end of an existing file then use a, open for append. If you want an open for writing to fail if the file already exists then specify x. You can open a file for both reading and writing using + and exactly how this works is explained later.
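
The behavior of r, w, a and x can be checked with a small experiment along the following lines. The file name is invented and the with statement is used to make sure each file is closed.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "modes.txt"     # hypothetical file name

    # r: the file must already exist
    try:
        open(path, "r")
        missing = False
    except FileNotFoundError:
        missing = True

    # w: creates the file, or empties it if it already exists
    with open(path, "w") as f:
        f.write("first\n")
    with open(path, "w") as f:
        f.write("second\n")

    # a: keeps the existing contents and writes at the end
    with open(path, "a") as f:
        f.write("third\n")

    # x: refuses to open a file that already exists
    try:
        open(path, "x")
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

    with open(path, "r") as f:
        lines = f.read().splitlines()

print(missing, exclusive_failed, lines)   # True True ['second', 'third']
```

The line "first" is lost because the second open for writing truncates the file.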

The two key modes are t for text and b for binary. The difference is very simple, but potentially very confusing. A binary file is written and read in terms of bytes which you can interpret in any way that you like. A text file is a binary file where the bytes are automatically interpreted as text using some encoding.
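
To see the difference, the sketch below writes a short string in text mode and reads the same file back in both binary and text mode. UTF-8 is an assumed encoding, chosen for the example, and the file name is hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "cafe.txt"      # hypothetical file name

    # Text mode: the string is encoded to bytes as it is written
    with open(path, "wt", encoding="utf-8") as f:
        f.write("café")

    # Binary mode: we get the raw bytes back
    with open(path, "rb") as f:
        raw = f.read()

    # Text mode: the bytes are decoded back to a string
    with open(path, "rt", encoding="utf-8") as f:
        text = f.read()

print(raw)                    # b'caf\xc3\xa9'
print(text)                   # café
print(len(raw), len(text))    # 5 4
```

The file holds five bytes but only four characters because the accented letter needs two bytes in UTF-8.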

The open function returns a different io object as the file object, depending on the mode used to open the file. For example, if you open the file for writing in binary mode, i.e. wb, then you get a BufferedWriter object. If you open the file for reading in binary mode, i.e. rb, then you get a BufferedReader object. If you open a binary file for reading and writing then you get a BufferedRandom object. If you open in text mode then you get a TextIOWrapper object. In most cases this doesn’t matter, but if you need to know more about the lower-level classes see the documentation for the io module.
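
You can confirm which class you get by printing the type of the file object for each mode. This sketch assumes the default buffering and uses a hypothetical file name.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "types.bin"     # hypothetical file name

    kinds = {}
    # wb comes first so that the file exists for the other modes
    for mode in ("wb", "rb", "r+b", "r"):
        with open(path, mode) as f:
            kinds[mode] = type(f).__name__

for mode, kind in kinds.items():
    print(mode, kind)
# wb BufferedWriter
# rb BufferedReader
# r+b BufferedRandom
# r TextIOWrapper
```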

It is generally said that text mode is easier but binary is more fundamental and once you understand it then text mode is trivial.

In chapter but not included in this extract

  • Binary Mode
  • Encoding To Binary
  • Struct and Binary Files
  • Records and Dataclass
  • Positioning – Random Access Files
  • Buffers
  • Context Managers and With

Summary

  • It is a principle of Unix/Linux that nearly everything is a file. Files are the abstraction most often used to characterize the transfer of data out of or into the machine or between two parts of the machine.

  • A simple sequential file is very similar to a FIFO queue. By default writes are to the end of the file and reads are from the start.

  • Files have to be opened to be used and closed when you have finished with them.

  • To specify where a file is to be found we generally use a path, which includes a file name. Paths are more easily worked with via the pathlib module.

  • Files can be opened for reading or writing and in either text or binary mode. Binary mode is the more fundamental in that a text mode file is simply a binary file that is treated as representing text in some encoding.

  • The problem with using any file is knowing how the data is stored and hence how many bytes to read to obtain an item of data.

  • To write data to a binary file you have to convert the data to a representation as a byte sequence and you can do this most easily using the struct module.

  • Files are often structured using repeats of the same record format.

  • In Python records can be represented using tuples, named tuples or, most often, a dataclass.

  • You can easily read and write a dataclass to a binary file using an associated Struct.

  • You can use seek and tell to position the file location in a random access file.

  • Creating a random access record file is easy as long as the records are of fixed size. If you want to do the same thing with variable length records then you need to use an index.

  • Buffers are often used to speed up the transfer of data between devices. The only time you need to worry about buffers is when the data is being transferred in real time.

  • The context manager is a simple mechanism for implementing resource construction and destruction steps. It is a simple way of making sure that a file is always closed when you have finished with it.
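
As a rough illustration of how several of these points fit together, the sketch below writes fixed-size records with a Struct and then uses seek to jump directly to one of them. The record layout, the data and the file name are all assumptions made for the example rather than the chapter's own code.

```python
import struct
import tempfile
from pathlib import Path

# Assumed record layout: a 4-byte little-endian int and a 10-byte name
record = struct.Struct("<i10s")

people = [(1, b"Ada"), (2, b"Grace"), (3, b"Linus")]   # made-up data

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "people.dat"    # hypothetical file name

    # The with statement closes the file even if an exception occurs
    with path.open("wb") as f:
        for ident, name in people:
            f.write(record.pack(ident, name))   # name is null-padded to 10 bytes

    with path.open("rb") as f:
        f.seek(2 * record.size)        # jump straight to the third record
        position = f.tell()
        ident, name = record.unpack(f.read(record.size))

print(position, ident, name.rstrip(b"\x00"))   # 28 3 b'Linus'
```

Because every record is 14 bytes long, the position of record n is simply n times the record size, which is what makes fixed-size records easy to access at random.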

Programmer's Python
Everything is Data

Is now available as a print book: Amazon
