Python Iterable and Iterator

If, like me, you have only a few months of Python experience, you may be in the same position I am: you know something about Python, but not enough. So I spent some time gathering scattered bits of knowledge to dig deeper into the language. This post covers one of those pieces: iterables and iterators.

But before we compare iterables and iterators, what does iteration mean?

Iteration is the process of taking the items of a collection one after another.

Now we can come to iterables and iterators.

An Iterable is an object capable of returning its members one at a time.

In practice, it is anything that can be looped over, that is, anything that can appear on the right-hand side of a for-loop (for x in iterable: ...):

  • all sequence types (such as list, str, and tuple);
  • some non-sequence types, such as dict and file objects;
  • objects of any class you define with an __iter__() method; and
  • objects of any class you define with a __getitem__() method that implements Sequence semantics, suitable for indexed lookup.

Sequence: An iterable which supports efficient element access using integer indices via the __getitem__() special method and defines a __len__() method that returns the length of the sequence. Some built-in sequence types are list, str, tuple, and bytes.
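To make the difference between an iterable and a Sequence concrete, here is a small sketch using the abstract base classes in collections.abc. The helper name classify is my own and is only for illustration.

```python
from collections.abc import Iterable, Sequence


def classify(obj):
    """Return (is_iterable, is_sequence) for obj, judged by the ABCs."""
    return isinstance(obj, Iterable), isinstance(obj, Sequence)


# list, str, tuple and bytes are sequences; dict and set are iterable only.
for sample in ([1, 2], "ab", (1, 2), b"ab", {"k": "v"}, {1, 2}, 42):
    print(type(sample).__name__, classify(sample))
```

One caveat: isinstance(obj, Iterable) only detects classes that define __iter__(). A class that is iterable purely through __getitem__() is not recognized this way, so calling iter(obj) is the more reliable check.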

An Iterator is an object representing a stream of data.

  • It complies with the iterator protocol, which consists of two special methods, __iter__() and __next__(), and it keeps state that remembers where it is during iteration;
  • The __iter__() method returns the iterator object itself;
  • The __next__() method returns the next value in the iteration, updates the state to point to the following value, and raises StopIteration when no values are left.
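The protocol can be watched in action by driving an iterator by hand with the built-in next(), which calls __next__() for us. This is only a minimal sketch; the variable names are mine.

```python
letters = iter("ab")       # an iterator over the two characters

first = next(letters)      # calls letters.__next__()
second = next(letters)

try:
    next(letters)          # nothing is left, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)
```

If you would rather get a default value than an exception, next() also accepts a second argument, as in next(letters, None).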

So what is the relationship between an iterator and an iterable?

An iterable is an object that has an __iter__ method which returns an iterator.
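This relationship is what a for-loop relies on. As a rough sketch, the function below (manual_for is a name I made up) collects items the way a for-loop would, using only iter() and next():

```python
def manual_for(iterable):
    """Collect items the way a for-loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)        # ask the iterable for an iterator
    while True:
        try:
            item = next(iterator)    # fetch the next item
        except StopIteration:        # the iterator is exhausted
            break
        collected.append(item)
    return collected


print(manual_for(["hello", "world"]))
```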

Note:

  • Every iterator is also an iterable, but not every iterable is an iterator;
  • An iterator is self-iterable: calling its __iter__ method returns self;
  • An iterator's method for fetching the next value is named __next__ in Python 3, but next in Python 2.
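A quick way to check these notes is to compare a list with the iterator made from it. This is a small sketch with variable names of my own choosing:

```python
words = ["hello", "world"]       # an iterable, but not an iterator
words_iter = iter(words)         # an iterator over the list

# An iterator returns itself from iter(); a list hands out a new iterator each time.
iterator_is_self = iter(words_iter) is words_iter
list_gives_new = iter(words) is not iter(words)

# A list has no __next__ method, so it is not an iterator.
list_has_next = hasattr(words, "__next__")

# An iterator can be consumed only once; the list can be looped over again.
first_pass = list(words_iter)
second_pass = list(words_iter)

print(iterator_is_self, list_gives_new, list_has_next, first_pass, second_pass)
```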

Let’s look at some examples with data types that support iterators.

# sequence --string
>>>
a = "hello world"
iter_a = iter(a)
print(iter_a)
print(type(iter_a))
for ch in iter_a:
    print(ch)
============
<str_iterator object at 0x7fdc01a4d940>
<class 'str_iterator'>
h
e
l
l
o

w
o
r
l
d
# sequence --list
>>>
a = ["hello", "world"]
iter_a = iter(a)
print(iter_a)
print(type(iter_a))
for item in iter_a:
    print(item)
============
<list_iterator object at 0x7fa0586e4940>
<class 'list_iterator'>
hello
world
# sequence --tuple
>>>
a = ("hello", "world")
iter_a = iter(a)
print(iter_a)
print(type(iter_a))
for item in iter_a:
    print(item)
============
<tuple_iterator object at 0x7f4a288c0940>
<class 'tuple_iterator'>
hello
world
# unordered collection --set (iteration order may vary)
>>>
a = {"hello", "world"}
iter_a = iter(a)
print(iter_a)
print(type(iter_a))
for item in iter_a:
    print(item)
============
<set_iterator object at 0x7f16c4069b40>
<class 'set_iterator'>
world
hello
# mapping --dict
>>> 1) key

a = {"first": "hello", "second": "world"}
iter_a = iter(a)
print(iter_a)
print(type(iter_a))
for key in iter_a:
    print(key)
============
<dict_keyiterator object at 0x7f2cd98997c0>
<class 'dict_keyiterator'>
first
second
>>> 2) value
a = {"first": "hello", "second": "world"}
iter_a = iter(a.values())
print(iter_a)
print(type(iter_a))
for value in iter_a:
    print(value)
============
<dict_valueiterator object at 0x7f90c68f5860>
<class 'dict_valueiterator'>
hello
world
>>> 3) key-value
a = {"first": "hello", "second": "world"}
iter_a = iter(a.items())
print(iter_a)
print(type(iter_a))
for pair in iter_a:
    print(pair)
============
<dict_itemiterator object at 0x7f9c78f26810>
<class 'dict_itemiterator'>
('first', 'hello')
('second', 'world')
# file object
>>>
with open("helloworld.txt", "rt") as f:
    # a file object is its own iterator and yields one line at a time
    for line in f:
        print(line, end="")
============txt & console
hello
world

Now that we have had a taste of iterators, let’s dig into the iter() syntax.

iter(object[, sentinel])

Python’s iter() function takes up to two arguments.

  • object (required): an iterable from which an iterator is to be created, or a callable when sentinel is given.
  • sentinel (optional): a value that represents the terminating condition. When it is present, object must be callable.

So Python provides two ways of interpreting the first argument, depending on whether the sentinel is given.

  • Without sentinel: the object has to support the iteration protocol (the __iter__() method) or the sequence protocol (the __getitem__() method with integer arguments starting at 0).
    If it supports neither protocol, TypeError is raised.
  • With sentinel:
    the object must be callable. The iterator created calls the object with no arguments on each call to its __next__() method. If the returned value equals the sentinel, StopIteration is raised; otherwise the value is returned.
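Both forms can be tried without touching the file system. The sketch below uses io.StringIO as a stand-in for a real stream, and the chunk size of 3 is an arbitrary choice of mine:

```python
import io

# Without sentinel: an object that supports neither protocol is rejected.
try:
    iter(42)
    int_rejected = False
except TypeError:
    int_rejected = True

# With sentinel: the first argument must be callable, so a plain int fails too.
try:
    iter(42, 0)
    non_callable_rejected = False
except TypeError:
    non_callable_rejected = True

# With sentinel: call stream.read(3) repeatedly until it returns "".
stream = io.StringIO("abcdefgh")
chunks = list(iter(lambda: stream.read(3), ""))

print(int_rejected, non_callable_rejected, chunks)
```

The sentinel form fits whenever a function signals "no more data" by returning a special value instead of raising an exception.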

Note the mention of __getitem__(); it is there for historical reasons:

__getitem__ is a legacy mechanism from before Python had the modern iteration protocol. Its drawback is that it requires more than iteration needs: it allows going forwards and backwards, which requires random access, and that can be costly when reading files or network streams. In comparison, __iter__ allows iteration without random access.

But since __getitem__ was supported first, iter() still accepts any object that implements it: the object is indexed with 0, 1, 2, ... and iteration ends when IndexError is raised. This is why any sequence is automatically iterable.

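For example, in Python 2.7 a string had no __iter__ method and was iterable only through __getitem__. Dicts also define __getitem__, but it looks up keys rather than integer positions, and sets do not define it at all; both are iterable through __iter__ instead.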

So the process iter() follows to get an iterator is:

  • Check for an __iter__ method. If it exists, use the new iteration protocol.
  • Otherwise, if __getitem__ exists, call it with 0, 1, 2, ... until it raises IndexError.
  • If neither exists, raise TypeError.
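The lookup order can be checked with three toy classes. The class names Both, OnlyGetitem and Neither are made up for this sketch:

```python
class Both:
    """Defines both methods; iter() should prefer __iter__."""

    def __iter__(self):
        return iter(["from __iter__"])

    def __getitem__(self, index):
        if index >= 1:
            raise IndexError
        return "from __getitem__"


class OnlyGetitem:
    """Iterable only through the legacy __getitem__ fallback."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError
        return index


class Neither:
    """Supports neither protocol, so iter() raises TypeError."""


both_result = list(Both())
getitem_result = list(OnlyGetitem())

try:
    iter(Neither())
    neither_iterable = True
except TypeError:
    neither_iterable = False

print(both_result, getitem_result, neither_iterable)
```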

Let’s see a few examples of objects that work with iter():

# without sentinel using __getitem__
>>>
class A:
    def __getitem__(self, index):
        # iter() calls this with 0, 1, 2, ... until IndexError is raised
        if index >= 10:
            raise IndexError
        return index * 10

list(A())
==========================
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
# without sentinel using __iter__
>>>
class A:
    def __iter__(self):
        self.n = 0
        return self

    def __next__(self):
        # stop after ten values so the result ends at 90
        if self.n < 10:
            result = 10 * self.n
            self.n += 1
            return result
        else:
            raise StopIteration

list(A())
==================
[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
# without sentinel using __iter__ (generator ver)
>>>
class A:
    def __iter__(self):
        # a generator function returns a ready-made iterator
        yield 10
        yield 20
        yield 30

list(A())
===============
[10, 20, 30]
# with sentinel
>>>
with open('example.txt', "rt") as f:
    # keep calling readline() until a line equal to 'END' comes back
    for line in iter(lambda: f.readline().strip(), 'END'):
        print(line)
===============example.txt
10
20
30
END
40
50
==============console
10
20
30

That’s all for now!

Happy Reading!
