Demystify Python Packages and Modules

Photo by Snowscat on Unsplash

While developing a Python scraping tool for e-commerce not long ago, I was looking into better ways to organise my directories and found that I needed to understand Python modules and packages a bit more.

Let's look at the module first.

A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended. Within a module, the module’s name is available as the value of the global variable __name__.

So Python modules are just normal Python files. There are differences, though, between executing a module as a script and importing it from another module.

# fibo.py
print("run as main")

For example, to execute the module as a script, run:

python fibo.py 

The interpreter assigns the hard-coded string "__main__" to the __name__ variable, and you will see the result on the console:

run as main

But if your module is imported rather than run as the main program, __name__ will be the module name, "fibo", not "__main__". Top-level statements still execute on import, so any code that should run only when the file is executed as a script needs a guard. To let a module work both as a main program and as an imported module, wrap the script-only code in an if statement:

if __name__ == '__main__':
    # runs only when the file is executed as a script
    print("run as main")
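To see the guard take effect without leaving one file, here is a minimal sketch. It assumes a throwaway fibo.py written to a temporary directory and uses runpy.run_path from the standard library to execute it under a chosen __name__, which approximates "run as a script" versus "imported". The helper name name_seen_by_module is made up for this illustration.

```python
import runpy
import tempfile
from pathlib import Path

# Source of a throwaway module; the guarded branch only runs as "__main__".
SOURCE = '''
result = "imported as " + __name__
if __name__ == "__main__":
    result = "run as main"
'''


def name_seen_by_module(run_name):
    """Execute the throwaway module with __name__ set to run_name."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "fibo.py"
        path.write_text(SOURCE)
        # run_path returns the module's globals after execution.
        namespace = runpy.run_path(str(path), run_name=run_name)
        return namespace["result"]


print(name_seen_by_module("__main__"))  # guarded branch runs
print(name_seen_by_module("fibo"))      # guarded branch is skipped
```

The unguarded first line runs in both cases; only the guarded assignment depends on how the file was started.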

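When a module is imported, the interpreter first looks for a built-in module with that name. If none is found, it searches for the module in a list of directories from the following sources:

  • The directory containing the input script (or the current directory when no file is specified).
  • PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
  • An installation-dependent list of directories configured at the time Python is installed.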

The resulting search path is accessible in the Python variable sys.path:

>>> import sys
>>> sys.path
['', '/Library/Frameworks/Python.framework/Versions/3.7/lib/python37.zip', '/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7', '/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/lib-dynload', '/Users/Library/Python/3.7/lib/python/site-packages', '/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages']
>>>

You can modify this list at run time to add the directory that contains your module if it is not already there:

>>> import sys
>>> sys.path.append('/user/lib/your-module')
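As a rough end-to-end sketch of the same idea, the snippet below creates a module in a temporary directory, appends that directory to sys.path, and then imports the module by name. The module name hypothetical_helper and the function import_from_directory are invented for this example and are not part of any library.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def import_from_directory(directory, module_name):
    """Make `directory` searchable, then import `module_name` from it."""
    if directory not in sys.path:
        sys.path.append(directory)
    # Let the import system notice files created after start-up.
    importlib.invalidate_caches()
    return importlib.import_module(module_name)


# Create a tiny module somewhere Python does not search by default.
tmp = tempfile.mkdtemp()
Path(tmp, "hypothetical_helper.py").write_text("GREETING = 'hello'\n")

helper = import_from_directory(tmp, "hypothetical_helper")
print(helper.GREETING)
```

Note that the entry added to sys.path is the directory that contains the module, not the module file itself.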

All modules are singletons by nature because of Python’s module importing steps:

  1. Check whether a module is already imported.
  2. If yes, return it.
  3. If not, find a module, initialise it, and return it.

So a module is initialised only the first time it is imported; later imports return the already initialised module. As a result, you may need to force a reload of the module after changing it.
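The "already imported" check is backed by the sys.modules dictionary, which maps module names to module objects. A small sketch using the standard json module (any module would do) shows that a second import hands back the very same object:

```python
import importlib
import sys

import json as first

# A second import of the same name is answered from the sys.modules cache.
second = importlib.import_module("json")

same_object = first is second
cached = sys.modules["json"] is first

print(same_object, cached)
```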

# module.py
a = "module loaded"
print('this is the first time:', a)

>>> import module
this is the first time: module loaded
>>> import module

So the print statement runs only on the first import. If you change a module and need to reload it, either restart the interpreter or call reload() from the importlib module:

>>> import module
this is the first time: module loaded
>>> import module
>>> import importlib
>>> importlib.reload(module)
this is the first time: module loaded
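The session above reloads an unchanged file. Here is a sketch of the more common case, where the file is edited between import and reload. It assumes a throwaway module called hypothetical_settings in a temporary directory, and it turns off bytecode caching so that a quick edit within the same second cannot be masked by a stale cached file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # keep this quick demo free of cached .pyc files

tmp = tempfile.mkdtemp()
source = Path(tmp, "hypothetical_settings.py")
source.write_text("VALUE = 'first version'\n")
sys.path.insert(0, tmp)

import hypothetical_settings
before = hypothetical_settings.VALUE

# Edit the file on disk, as you would in an editor.
source.write_text("VALUE = 'second, edited version'\n")
importlib.invalidate_caches()

import hypothetical_settings  # answered from sys.modules, nothing re-runs
after_plain_import = hypothetical_settings.VALUE

importlib.reload(hypothetical_settings)  # re-executes the module's code
after_reload = hypothetical_settings.VALUE

print(before, "|", after_plain_import, "|", after_reload)
```

One caveat: names bound earlier with from module import name keep pointing at the old objects after a reload, so reloading is mostly useful in interactive sessions.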

Now let's look at packages. Packages are a way of structuring Python’s module namespace by using “dotted module names”. So a Python package is essentially a folder containing one or more modules.

sound/                          Top-level package
    __init__.py                 Initialize the sound package
    formats/                    Subpackage for file format
    effects/                    Subpackage for sound effects
        __init__.py
        echo.py
        ...
    filters/                    Subpackage for filters
        __init__.py
        equalizer.py
        ...

Note that __init__.py files are required to make Python treat directories as regular packages. This prevents directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path. __init__.py can be an empty file, but it can also contain initialization code for the package.

Users of the package can import individual modules from the package.

  • To import a module using its full name:
      import sound.effects.echo
      sound.effects.echo.echofilter(input, output, delay=0.7, atten=4)
  • To import a module using from...import...:
      from sound.effects import echo
      echo.echofilter(input, output, delay=0.7, atten=4)
  • To import the desired function/class directly using from...import...:
      from sound.effects.echo import echofilter
      echofilter(input, output, delay=0.7, atten=4)

Note that when using from package import item, the item can be either a submodule (or subpackage) of the package, or a function, class or variable. The import statement first tests whether the item is defined in the package; if not, it assumes it is a module and attempts to load it. If it fails to find it, an ImportError exception is raised.

By contrast, when using syntax like import item.subitem.subsubitem, each item except for the last must be a package; the last item can be a module or a package but can’t be a class or function or variable defined in the previous item.
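To try these import forms without a real audio library, the sketch below builds a bare-bones sound package in a temporary directory. The echofilter body is a hypothetical stub that only returns its arguments; the point is how the names resolve, not what the function does.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build sound/effects/echo.py with the required __init__.py files.
root = Path(tempfile.mkdtemp())
effects = root / "sound" / "effects"
effects.mkdir(parents=True)
(root / "sound" / "__init__.py").write_text("")
(effects / "__init__.py").write_text("")
(effects / "echo.py").write_text(
    "def echofilter(signal, delay=0.7, atten=4):\n"
    "    return ('echo', signal, delay, atten)\n"  # hypothetical stub
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import sound.effects.echo                   # full dotted name
from sound.effects import echo              # submodule via from...import
from sound.effects.echo import echofilter   # function via from...import

# All three routes lead to the same function object.
same_function = sound.effects.echo.echofilter is echo.echofilter is echofilter

# The plain import form rejects a function as the last dotted item.
try:
    import sound.effects.echo.echofilter
except ImportError:
    dotted_import_failed = True
else:
    dotted_import_failed = False

print(same_function, dotted_import_failed)
```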

To import one submodule from another submodule of the same package, you can use either an absolute import or a relative import.

For example, to use sound.filters.equalizer from a module in the sound.effects package, we can use the absolute import from sound.filters import equalizer.

You can also use relative imports, with the from module import name form of the import statement. These imports use leading dots to indicate the current and parent packages involved in the relative import. For example, from a module inside the sound.effects package:

from . import echo
from .. import formats
from ..filters import equalizer
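Here is a small sketch of how those dots resolve. It assumes a stand-in package named demo_sound (a made-up name, chosen so it cannot clash with anything installed) with the same layout as sound, plus a hypothetical surround module inside effects that performs the three relative imports and records the full names they resolved to.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "demo_sound"
for sub in ("effects", "filters", "formats"):
    (pkg / sub).mkdir(parents=True)
    (pkg / sub / "__init__.py").write_text("")
(pkg / "__init__.py").write_text("")
(pkg / "effects" / "echo.py").write_text("NAME = 'echo'\n")
(pkg / "filters" / "equalizer.py").write_text("NAME = 'equalizer'\n")

# surround.py lives in demo_sound/effects, so "." is demo_sound.effects
# and ".." is demo_sound.
(pkg / "effects" / "surround.py").write_text(
    "from . import echo\n"
    "from .. import formats\n"
    "from ..filters import equalizer\n"
    "RESOLVED = (echo.__name__, formats.__name__, equalizer.__name__)\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

from demo_sound.effects import surround

print(surround.__package__)
print(surround.RESOLVED)
```

The dots are resolved against the importing module's package, which is why the same lines would fail in a file that is not loaded as part of a package.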

Note that relative imports are based on the name of the current module. Since the name of the main module is always "__main__", modules intended for use as the main module of a Python application must always use absolute imports.

What does that mean? Consider the following structure:

mydir
- project
  - __init__.py
  - module1.py
  - module2.py

I ended up in a similar scenario, and it troubled me a lot until I realised how module and package imports are supposed to work.

module1.py

print("module1")

module2.py

from . import module1

print("Module 2")

if __name__ == '__main__':
    print("Executed as script")

Let’s say you want to execute module2.py as a top-level script; by “top-level” I mean that your current directory is the project folder and you run python module2.py. You will get an error like ImportError: attempted relative import with no known parent package. That’s because when from . import module1 is read, Python cannot work out what . refers to: you are not running the file as part of a package, so package-level initialisation doesn’t run, and Python doesn’t even recognize the package’s existence. So if you want to run a submodule of a package directly, either use the -m switch from the parent directory (python -m project.module2, run from mydir) or use a top-level entry point script that invokes the submodule's functionality.
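To reproduce both outcomes side by side, the following sketch recreates the mydir/project layout in a temporary directory and launches module2 in the two ways described, using subprocess with the current interpreter. It assumes Python 3.7 or later (for capture_output); the variable names as_script and as_module are just labels for this demo.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Recreate mydir/project with the two modules from the example.
mydir = Path(tempfile.mkdtemp())
project = mydir / "project"
project.mkdir()
(project / "__init__.py").write_text("")
(project / "module1.py").write_text('print("module1")\n')
(project / "module2.py").write_text(
    "from . import module1\n"
    'print("Module 2")\n'
    "if __name__ == '__main__':\n"
    '    print("Executed as script")\n'
)

# 1) Run the file directly from inside the project folder.
as_script = subprocess.run(
    [sys.executable, "module2.py"],
    cwd=project, capture_output=True, text=True,
)

# 2) Run it as a module of the package, from the parent directory.
as_module = subprocess.run(
    [sys.executable, "-m", "project.module2"],
    cwd=mydir, capture_output=True, text=True,
)

print("direct run exit code:", as_script.returncode)
print(as_script.stderr.strip().splitlines()[-1])
print("-m run output:", as_module.stdout.split("\n"))
```

With -m, Python imports the project package first and then runs module2 inside it, so the leading dot has a package to refer to.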

Hope this is not too confusing!

Happy Reading!