Python en:Modules

Introduction
You have seen how you can reuse code in your program by defining functions once. What if you wanted to reuse a number of functions in other programs that you write? As you might have guessed, the answer is modules.

There are various methods of writing modules, but the simplest way is to create a file with a .py extension that contains functions and variables.

Another method is to write the modules in the native language in which the Python interpreter itself was written. For example, you can write modules in the C programming language and when compiled, they can be used from your Python code when using the standard Python interpreter.

A module can be imported by another program to make use of its functionality. This is how we can use the Python standard library as well. First, we will see how to use the standard library modules.

Example:
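The listing for this example is not present in this text. What follows is a minimal sketch of what a program named using_sys.py could look like, reconstructed only from the output and the explanation in this section; the exact wording of the original program is an assumption.

```python
# using_sys.py (reconstructed sketch, not the original listing)
import sys

print('The command line arguments are:')
for argument in sys.argv:
    # sys.argv[0] is the script name, the rest are the user's arguments
    print(argument)

print('The PYTHONPATH is', sys.path)
```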

Output:

    $ python using_sys.py we are arguments
    The command line arguments are:
    using_sys.py
    we
    are
    arguments
    The PYTHONPATH is ['', 'C:\\Windows\\system32\\python30.zip', 'C:\\Python30\\DLLs', 'C:\\Python30\\lib', 'C:\\Python30\\lib\\plat-win', 'C:\\Python30', 'C:\\Python30\\lib\\site-packages']

How It Works:

First, we import the sys module using the import statement. Basically, this translates to us telling Python that we want to use this module. The sys module contains functionality related to the Python interpreter and its environment i.e. the system.

When Python executes the import sys statement, it looks for the sys module. In this case, it is one of the built-in modules, and hence Python knows where to find it.

If it were not a compiled module, i.e. a module written in Python, then the Python interpreter would search for it in the directories listed in its sys.path variable. If the module is found, then the statements in the body of that module are run and the module is made available for you to use. Note that this initialization is done only the first time that we import a module.
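The "only the first time" behaviour can be observed directly. The sketch below imports the same standard library module twice under two names and checks that both refer to one module object. The choice of json is arbitrary, and sys.modules is the standard dictionary in which Python records modules that have already been loaded.

```python
import sys

import json as first_import
import json as second_import   # no second initialization happens here

# Both names refer to the very same module object.
print(first_import is second_import)   # True

# Loaded modules are recorded in sys.modules under their name.
print('json' in sys.modules)           # True
```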

The <tt>argv</tt> variable in the <tt>sys</tt> module is accessed using the dotted notation, i.e. <tt>sys.argv</tt>. It clearly indicates that this name is part of the <tt>sys</tt> module. Another advantage of this approach is that the name does not clash with any <tt>argv</tt> variable used in your program.

The <tt>sys.argv</tt> variable is a list of strings (lists are explained in detail in a later chapter). Specifically, <tt>sys.argv</tt> contains the list of command line arguments, i.e. the arguments passed to your program using the command line.

If you are using an IDE to write and run these programs, look for a way to specify command line arguments to the program in the menus.

Here, when we execute <tt>python using_sys.py we are arguments</tt>, we run the module <tt>using_sys.py</tt> with the <tt>python</tt> command and the other things that follow are arguments passed to the program. Python stores the command line arguments in the <tt>sys.argv</tt> variable for us to use.

Remember, the name of the script running is always the first argument in the <tt>sys.argv</tt> list. So, in this case we will have <tt>'using_sys.py'</tt> as <tt>sys.argv[0]</tt>, <tt>'we'</tt> as <tt>sys.argv[1]</tt>, <tt>'are'</tt> as <tt>sys.argv[2]</tt> and <tt>'arguments'</tt> as <tt>sys.argv[3]</tt>. Notice that Python starts counting from 0 and not 1.
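The indexing described above can be tried without a command line. In this sketch the list argv is a hypothetical stand-in holding the same strings that <tt>sys.argv</tt> would contain for the command shown earlier.

```python
# Stand-in for sys.argv after running: python using_sys.py we are arguments
argv = ['using_sys.py', 'we', 'are', 'arguments']

script_name = argv[0]    # counting starts at 0, so this is the script name
user_arguments = argv[1:]  # everything after the script name

print(script_name)       # using_sys.py
print(argv[3])           # arguments
print(user_arguments)    # ['we', 'are', 'arguments']
```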

The <tt>sys.path</tt> variable contains the list of directory names from which modules are imported. Observe that the first string in <tt>sys.path</tt> is empty; this empty string indicates that the current directory is also part of <tt>sys.path</tt>, so you can directly import modules located in the current directory. Otherwise, you will have to place your module in one of the directories listed in <tt>sys.path</tt>. The rest of the list is built from the <tt>PYTHONPATH</tt> environment variable, if it is set, and from installation-dependent defaults.

Note that the current directory is the directory from which the program is launched. Run <tt>import os; print(os.getcwd())</tt> to find out the current directory of your program.
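As a small illustration, the following sketch prints the current directory next to the entries of <tt>sys.path</tt>, so you can compare them on your own machine. The output depends on your installation, so none is shown here.

```python
import os
import sys

# The directory the program was launched from.
current_directory = os.getcwd()
print('Current directory:', current_directory)

# Every place Python will search when it meets an import statement.
for entry in sys.path:
    print('Search path entry:', repr(entry))
```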

Byte-compiled .pyc files
Importing a module is a relatively costly affair, so Python does some tricks to make it faster. One way is to create byte-compiled files with the extension <tt>.pyc</tt> which is an intermediate form that Python transforms the program into (remember the introduction section on how Python works?). This <tt>.pyc</tt> file is useful when you import the module the next time from a different program - it will be much faster since a portion of the processing required in importing a module is already done. Also, these byte-compiled files are platform-independent.


 * Note
 * These <tt>.pyc</tt> files are usually created in the same directory as the corresponding <tt>.py</tt> files (Python 3.2 and later place them in a <tt>__pycache__</tt> subdirectory of that directory instead). If Python does not have permission to write to files in that directory, then the <tt>.pyc</tt> files will not be created.
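Byte-compilation normally happens automatically on import, but it can also be triggered by hand to see where the file ends up. The sketch below uses the standard library's <tt>py_compile</tt> module on a throwaway file; the file name hello.py and its content are made up for the illustration.

```python
import os
import py_compile
import tempfile

# Create a tiny source file in a temporary folder we are allowed to write to.
folder = tempfile.mkdtemp()
source_path = os.path.join(folder, 'hello.py')
with open(source_path, 'w') as f:
    f.write("print('hello')\n")

# Byte-compile it; the return value is the path of the compiled file.
compiled_path = py_compile.compile(source_path)
print(compiled_path)
```

The printed path shows the location and naming scheme that your Python version uses for byte-compiled files.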

The from ... import ... statement
If you want to directly import the <tt>argv</tt> variable into your program (to avoid typing <tt>sys.</tt> every time), then you can use the <tt>from sys import argv</tt> statement. If you want to import all the names used in the <tt>sys</tt> module, then you can use the <tt>from sys import *</tt> statement. This works for any module.

In general, you should avoid using this statement and use the <tt>import</tt> statement instead since your program will avoid name clashes and will be more readable.
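To make the name clash concrete, here is a minimal sketch in which the program already has its own variable called <tt>argv</tt> (a made-up list for this illustration) before the <tt>from sys import argv</tt> statement runs.

```python
import sys

argv = ['my', 'own', 'list']   # a name our program already uses
print(argv is sys.argv)        # False: two different lists

from sys import argv           # silently rebinds the name argv
print(argv is sys.argv)        # True: our own list is no longer reachable as argv
```

With a plain <tt>import sys</tt>, the module's list is always spelled <tt>sys.argv</tt>, so the two can never be confused.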

A module's __name__
Every module has a name, and statements in a module can find out the name of their module. This is handy in the particular situation of figuring out if the module is being run standalone or being imported. As mentioned previously, when a module is imported for the first time, the code in that module is executed. We can use this to make the module behave differently depending on whether it is run by itself or imported from another module. This can be achieved using the <tt>__name__</tt> attribute of the module.

Example:
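The listing for this example is not present in this text. The following is a minimal sketch of a file named using_name.py that is consistent with the output shown in this section; the variable name message is an assumption.

```python
# using_name.py (reconstructed sketch, not the original listing)

if __name__ == '__main__':
    message = 'This program is being run by itself'
else:
    message = 'I am being imported from another module'

print(message)
```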

Output:

    $ python using_name.py
    This program is being run by itself

    $ python
    >>> import using_name
    I am being imported from another module
    >>>

How It Works:

Every Python module has its <tt>__name__</tt> defined and if this is <tt>'__main__'</tt>, it implies that the module is being run standalone by the user and we can take appropriate actions.

Making Your Own Modules
Creating your own modules is easy; you've been doing it all along! This is because every Python program is also a module. You just have to make sure it has a <tt>.py</tt> extension. The following example should make it clear.

Example:

The above was a sample module. As you can see, there is nothing particularly special about it compared to our usual Python program. We will next see how to use this module in our other Python programs.

Remember that the module should be placed in the same directory as the program that we import it in, or the module should be in one of the directories listed in <tt>sys.path</tt>.
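The listing for mymodule_demo.py is not present in this text. Normally you would simply save mymodule.py next to your program and write <tt>import mymodule</tt>. So that the sketch below can run on its own, it first writes an assumed version of mymodule.py into a temporary folder and adds that folder to <tt>sys.path</tt>, which also illustrates the second option mentioned above.

```python
# mymodule_demo.py (self-contained sketch)
import os
import sys
import tempfile

# Assumption: a stand-in for the mymodule.py file described in this section.
MODULE_SOURCE = (
    "def sayhi():\n"
    "    print('Hi, this is mymodule speaking.')\n"
    "\n"
    "__version__ = '0.1'\n"
)

# Write the module into a fresh folder and put that folder on sys.path,
# so that the import statement below can find it.
folder = tempfile.mkdtemp()
with open(os.path.join(folder, 'mymodule.py'), 'w') as f:
    f.write(MODULE_SOURCE)
sys.path.insert(0, folder)

import mymodule

mymodule.sayhi()
print('Version', mymodule.__version__)
```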

Output:

    $ python mymodule_demo.py
    Hi, this is mymodule speaking.
    Version 0.1

How It Works:

Notice that we use the same dotted notation to access members of the module. Python makes good reuse of the same notation to give the distinctive 'Pythonic' feel to it so that we don't have to keep learning new ways to do things.

Here is a version utilising the <tt>from..import</tt> syntax:

The output of <tt>mymodule_demo2.py</tt> is the same as the output of <tt>mymodule_demo.py</tt>.

Notice that if there were already a <tt>__version__</tt> name declared in the module that imports mymodule, there would be a clash. This is also likely because it is common practice for each module to declare its version number using this name. Hence, it is always recommended to prefer the <tt>import</tt> statement even though it might make your program a little longer.

You could also use the <tt>from mymodule import *</tt> statement.

This will import all public names such as <tt>sayhi</tt> but would not import <tt>__version__</tt> because names that start with an underscore are skipped (unless the module lists them in <tt>__all__</tt>).


 * Zen of Python
 * One of Python's guiding principles is that "Explicit is better than implicit". Run <tt>import this</tt> to learn more and see this discussion which lists examples for each of the principles.

The dir function
You can use the built-in <tt>dir</tt> function to list the identifiers that an object defines. For example, for a module, the identifiers include the functions, classes and variables defined in that module.

When you supply a module name to the <tt>dir</tt> function, it returns the list of the names defined in that module. When no argument is passed to it, it returns the list of names defined in the current module.

Example:
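The listing for this example is not present in this text. The sketch below follows the steps described in this section as a script rather than an interactive session; the variable names other than <tt>a</tt> are assumptions. The full lists are long and differ between Python versions, so the sketch only checks for particular names.

```python
import sys

# Names defined by the sys module.
sys_names = dir(sys)
print('argv' in sys_names)    # True
print('path' in sys_names)    # True

# Names defined in the current module, before and after deleting a variable.
a = 5
names_with_a = dir()
print('a' in names_with_a)    # True

del a
names_without_a = dir()
print('a' in names_without_a) # False
```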

How It Works:

First, we see the usage of <tt>dir</tt> on the imported <tt>sys</tt> module. We can see the huge list of attributes that it contains.

Next, we use the <tt>dir</tt> function without passing parameters to it. By default, it returns the list of attributes for the current module. Notice that the list of imported modules is also part of this list.

To see <tt>dir</tt> in action, we define a new variable <tt>a</tt>, assign it a value, and then check <tt>dir</tt> again; the list now contains an additional entry with the same name. We then remove the variable/attribute of the current module using the <tt>del</tt> statement, and the change is again reflected in the output of the <tt>dir</tt> function.

A note on <tt>del</tt> - this statement is used to delete a variable/name and after the statement has run, in this case <tt>del a</tt>, you can no longer access the variable <tt>a</tt> - it is as if it never existed before at all.

Note that the <tt>dir</tt> function works on any object. For example, run <tt>dir(print)</tt> to learn about the attributes of the print function, or <tt>dir(str)</tt> for the attributes of the str class.

Packages
By now, you must have started observing the hierarchy of organizing your programs. Variables usually go inside functions. Functions and global variables usually go inside modules. What if you wanted to organize modules? That's where packages come into the picture.

Packages are just folders of modules with a special <tt>__init__.py</tt> file that indicates to Python that this folder is special because it contains Python modules.

Let's say you want to create a package called 'world' with subpackages 'asia', 'africa', etc. and these subpackages in turn contain modules like 'india', 'madagascar', etc.

This is how you would structure the folders:

    - <some folder present in the sys.path>/
        - world/
            - __init__.py
            - asia/
                - __init__.py
                - india/
                    - __init__.py
                    - foo.py
            - africa/
                - __init__.py
                - madagascar/
                    - __init__.py
                    - bar.py

Packages are just a convenience to hierarchically organize modules. You will see many instances of this in the standard library.
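To see how such a hierarchy is used from code, the following sketch builds the asia branch of the layout in a temporary folder and imports the innermost module with dotted notation. The temporary folder stands in for "some folder present in the sys.path", and the content of foo.py (a function called describe) is made up for the illustration, since the text does not say what the module contains.

```python
import importlib
import os
import sys
import tempfile

# Build world/asia/india/ inside a temporary folder.
root = tempfile.mkdtemp()
india_dir = os.path.join(root, 'world', 'asia', 'india')
os.makedirs(india_dir)

# Give every package folder an empty __init__.py file.
for parts in (('world',), ('world', 'asia'), ('world', 'asia', 'india')):
    with open(os.path.join(root, *parts, '__init__.py'), 'w'):
        pass

# Assumed content for foo.py.
with open(os.path.join(india_dir, 'foo.py'), 'w') as f:
    f.write("def describe():\n"
            "    return 'foo lives in world.asia.india'\n")

# Make the folder that contains 'world' searchable, then import.
sys.path.insert(0, root)
importlib.invalidate_caches()  # recommended after creating modules at run time

from world.asia.india import foo

print(foo.__name__)     # world.asia.india.foo
print(foo.describe())   # foo lives in world.asia.india
```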

Summary
Just like functions are reusable parts of programs, modules are reusable programs. Packages are another hierarchy to organize modules. The standard library that comes with Python is an example of such a set of packages and modules.

We have seen how to use these modules and create our own modules.

Next, we will learn about some interesting concepts called data structures.
