Python Lunch & Learn

Session 2

Atul Saurav

Agenda:

  • Recap - Simple vs Complex Compound Data Types (Collections)
  • Compound Types
    • Ordered Collections / Sequences
      • list
      • string
      • tuple
    • Unordered Collections
      • Set
      • Dictionaries

Lists vs Tuples

  • Lists are more versatile than tuples, and that versatility comes with some overhead
  • lists are enclosed within [] whereas tuples are enclosed within ()
  • lists are mutable, tuples are not
  • whenever mutability is not needed, prefer a tuple
In [1]:
a = [1,2,3,4,5]
a[3] = 20
a
Out[1]:
[1, 2, 3, 20, 5]
In [3]:
a = (1,2,3,4,5)
a[3] = 20
a
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-89b1b6bece21> in <module>()
      1 a = (1,2,3,4,5)
----> 2 a[3] = 20
      3 a

TypeError: 'tuple' object does not support item assignment
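
One practical consequence of immutability, sketched here as an illustration and not part of the original session: a tuple of hashable items can be used as a dictionary key, while a list cannot. The names `point_names`, `origin` and `error_message` are made up for this example.

```python
# Tuples are hashable (when their items are), so they can be dict keys.
point_names = {(0, 0): 'origin', (1, 0): 'unit x'}
origin = point_names[(0, 0)]   # 'origin'

# Lists are mutable and therefore unhashable; using one as a key fails.
try:
    point_names[[0, 1]] = 'unit y'
    error_message = None
except TypeError as exc:
    error_message = str(exc)   # the message mentions an unhashable type
```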

Operations on Ordered Collections / Sequences

  • Membership tests
  • Concatenation
  • Repetition
  • Index
  • Slicing
  • Length
  • max, min item
  • Iteration (will be covered at the end of Unordered Collections)

Membership tests:

Test whether an item is present in the collection or not:

  • Evaluates to True if present
  • Evaluates to False if not present
In [4]:
't' in 'Python'
Out[4]:
True
In [5]:
3 in [1,2,3]
Out[5]:
True
In [6]:
[1,2] in ['abc',[1,2],'xyz']
Out[6]:
True
In [7]:
[1,2] in ['abc',[1,2, 3],'xyz']
Out[7]:
False
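
A few more membership cases, as a small sketch: the same `in` test works on tuples, `not in` gives the negated test, and strings match whole substrings. The variable names are chosen only for this illustration.

```python
lang_tuple = ('Python', 'Scala', 'Java')   # membership works on tuples too

has_scala = 'Scala' in lang_tuple          # True
no_perl = 'Perl' not in lang_tuple         # True: `not in` is the negated test

# For strings, `in` matches substrings, not just single characters.
has_substring = 'tho' in 'Python'          # True

# For lists and tuples, `in` compares whole items: 1 is not an item here,
# it only appears inside the nested list [1, 2].
nested_hit = 1 in ['abc', [1, 2], 'xyz']   # False
```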

Concatenation

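  • Works like string concatenation: the + operator joins two sequences into a new one
  • Both operands must be of the same sequence type (str + str, list + list, tuple + tuple)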
In [8]:
'abc' + '123'
Out[8]:
'abc123'
In [9]:
['a', 'c'] + [1,2]
Out[9]:
['a', 'c', 1, 2]
In [10]:
(1,2,3) + (4,'five',6)
Out[10]:
(1, 2, 3, 4, 'five', 6)
In [11]:
'atul' + ['1','2']
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-308775c8301d> in <module>()
----> 1 'atul' + ['1','2']

TypeError: cannot concatenate 'str' and 'list' objects
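
When the two sides differ in type, one of them can be converted first. The following is a minimal sketch of two possible conversions; which one is appropriate depends on whether a list or a string is wanted as the result.

```python
name = 'atul'
digits = ['1', '2']

# Option 1: turn the string into a list of characters, then concatenate lists.
as_list = list(name) + digits          # ['a', 't', 'u', 'l', '1', '2']

# Option 2: join the list into a string, then concatenate strings.
as_string = name + ''.join(digits)     # 'atul12'

# A list and a tuple also differ in type; convert one side first.
combined = [1, 2] + list((3, 4))       # [1, 2, 3, 4]
```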

Repetition

The * operator is overloaded on ordered collection types to mean 'repeated n times'

In [12]:
'Hello ' * 5 
Out[12]:
'Hello Hello Hello Hello Hello '
In [13]:
[1,2,3] * 3
Out[13]:
[1, 2, 3, 1, 2, 3, 1, 2, 3]
In [14]:
(6,'seven',7) * 2
Out[14]:
(6, 'seven', 7, 6, 'seven', 7)
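
One caveat worth knowing, shown here as an aside that goes slightly beyond the session: repetition copies references, so repeating a list that contains another list yields several references to the same inner list. The list comprehension at the end is one common way around it.

```python
# Repeating a list of immutable items behaves as expected.
zeros = [0] * 3                 # [0, 0, 0]

# Repeating a list that contains a list copies the reference, not the list.
grid = [[0] * 3] * 2            # two references to the same inner list
grid[0][0] = 9                  # the change shows up in both "rows"

# A list comprehension builds independent inner lists instead.
safe_grid = [[0] * 3 for _ in range(2)]
safe_grid[0][0] = 9             # only the first row changes
```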

Index

  • Provides random access to the elements of the ordered collection / sequence through index position
  • Offset starts with 0
  • Reverse indexing
    • Indexing can be performed from end to beginning as well
    • Starts with -1
In [15]:
lang = 'Python'
lang[1], lang[-1]
Out[15]:
('y', 'n')
In [16]:
even = [2,4,6,8,10]
even[3]
Out[16]:
8
In [17]:
row = ('123', 'Python', 'Active', '12/31/1999')
row[0], row[-2]
Out[17]:
('123', 'Active')
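
As a quick check of how forward and reverse indexes relate, the sketch below shows that index -k names the same item as index len(seq) - k, and that an index outside the valid range raises IndexError. The flag names are illustrative only.

```python
lang = 'Python'

# A negative index -k refers to the same item as len(seq) - k.
same_item = lang[-2] == lang[len(lang) - 2]   # True, both are 'o'

# Valid indexes run from -len(seq) to len(seq) - 1; anything else raises IndexError.
try:
    lang[6]
    out_of_range = False
except IndexError:
    out_of_range = True
```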

Slicing

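Slicing extracts a sub-sequence from a sequence / ordered collection, similar to a substring operation

  • syntax seq[ start : finish : stride ]; the slice stops at finish-1, i.e. the item at finish is excluded
  • if start is not provided, default is 0 (the beginning)
  • if finish is not provided, the slice runs through the last item (the default is the length of the sequence, not -1)
  • stride is the hop between items and defaults to 1; with a negative stride the slice walks backwards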
In [18]:
a = range(10)
a[:]
Out[18]:
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [19]:
a[3:7]
Out[19]:
[3, 4, 5, 6]
In [20]:
a[:7:2]
Out[20]:
[0, 2, 4, 6]
In [21]:
a[-1:-5]
Out[21]:
[]
In [22]:
a[-5:-1]
Out[22]:
[5, 6, 7, 8]
In [23]:
a[-1:-5:-1]
Out[23]:
[9, 8, 7, 6]
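
The defaults can be checked directly. This is a small sketch; `list(range(10))` is used so that the snippet gives a list on both Python 2 and Python 3.

```python
a = list(range(10))          # [0, 1, ..., 9] on both Python 2 and 3

# Omitted start/finish/stride default to the beginning, the end, and 1.
full_copy = a[::]            # same items as a[0:len(a):1]
first_three = a[:3]          # [0, 1, 2]; the item at index 3 is excluded
from_seven = a[7:]           # [7, 8, 9]; runs through the last item

# With a negative stride the defaults flip: start at the end, walk backwards.
reversed_copy = a[::-1]      # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
every_third = a[1:9:3]       # [1, 4, 7]
```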

Length

len() returns the number of top-level items (immediate children) in the sequence; a nested collection counts as a single item

In [24]:
a = [1,2,3,4,5]
len(a)
Out[24]:
5
In [25]:
b = tuple()
b
Out[25]:
()
In [26]:
len(b)
Out[26]:
0
In [27]:
a = [1,2,[3,4],5]
len(a)
Out[27]:
4
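
If the count of leaf items is wanted instead, the nesting has to be walked explicitly. The helper below, `deep_len`, is a hypothetical function written for this illustration; it is not a built-in.

```python
def deep_len(seq):
    """Count leaf items, descending into nested lists and tuples (hypothetical helper)."""
    total = 0
    for item in seq:
        if isinstance(item, (list, tuple)):
            total += deep_len(item)   # recurse into the nested sequence
        else:
            total += 1
    return total

a = [1, 2, [3, 4], 5]
shallow = len(a)      # 4: the nested list counts once
deep = deep_len(a)    # 5: 1, 2, 3, 4 and 5
```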

max, min item

  • Return the largest or smallest object in the sequence.
  • If the objects are instances of a user-defined class rather than simple types, __lt__ should be defined on that class so the objects can be compared
In [28]:
b = [3,5,3,5,7,4,2,8,3,2,4]
max(b), min(b)
Out[28]:
(8, 2)
In [29]:
max('Python')
Out[29]:
'y'
In [30]:
langs = ['Python', 'Scala', 'Java', 'C++']
max(langs), min(langs)
Out[30]:
('Scala', 'C++')
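
Two ways of controlling what "largest" means, sketched under assumptions: `max` and `min` accept a `key` function, and a user-defined class can supply `__lt__`. The `Version` class is hypothetical and exists only for this example.

```python
langs = ['Python', 'Scala', 'Java', 'C++']

# key= tells max/min what to compare; here, the length of each string.
longest = max(langs, key=len)     # 'Python'
shortest = min(langs, key=len)    # 'C++'


class Version(object):
    """Hypothetical class, ordered by its (major, minor) pair."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __lt__(self, other):
        return (self.major, self.minor) < (other.major, other.minor)


versions = [Version(2, 7), Version(3, 6), Version(2, 6)]
newest = max(versions)            # the Version with major=3, minor=6
oldest = min(versions)            # the Version with major=2, minor=6
```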

sorted

sorted() returns a new list with the items of the sequence in ascending (default) or descending order. For non-basic types, __lt__ needs to be defined on the class, or a key function needs to be provided

In [31]:
sorted(lang)
Out[31]:
['P', 'h', 'n', 'o', 't', 'y']
In [32]:
sorted(lang, reverse=True)
Out[32]:
['y', 't', 'o', 'n', 'h', 'P']
In [33]:
sorted(langs)
Out[33]:
['C++', 'Java', 'Python', 'Scala']
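
A short sketch of the `key` argument, and of the fact that `sorted` always returns a new list and leaves its input untouched. The variable names are illustrative.

```python
langs = ['Python', 'Scala', 'Java', 'C++']

# key= supplies the value to sort by; here, the length of each name.
by_length = sorted(langs, key=len)       # ['C++', 'Java', 'Scala', 'Python']

# Sorting a tuple yields a list; the tuple itself is not modified.
row = (3, 1, 2)
sorted_row = sorted(row)                 # [1, 2, 3]
```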

Unordered Collection

  • there are times when the order of items in the collection is not important
  • sometimes only unique items need to be stored in the collection
  • sometimes name/key-value pairs need to be stored

In the above scenarios, unordered collections, viz. Sets (set) and Dictionaries (dict), are useful

Sets

  • Stores a collection of unique items in no specific order; duplicates are dropped automatically
  • is mutable: items can be added and removed
  • enclosed in {}; an empty {} is a dict, so an empty set is created with set()
In [34]:
s = {1,2,3,4,5,3,2,4,6,72,4,6,3,2,46,'OK',2,2,4}
s
Out[34]:
{1, 2, 3, 4, 5, 6, 46, 72, 'OK'}
In [35]:
# Below is a dict
es = {}
type(es)
Out[35]:
dict
In [36]:
# this is how to initialize an empty set
es = set()
type(es)
Out[36]:
set
In [37]:
s.add('tomato')
s
Out[37]:
{1, 2, 3, 4, 5, 6, 46, 72, 'OK', 'tomato'}
In [38]:
s.remove(72)
s
Out[38]:
{1, 2, 3, 4, 5, 6, 46, 'OK', 'tomato'}
In [39]:
s.add(2)
s
Out[39]:
{1, 2, 3, 4, 5, 6, 46, 'OK', 'tomato'}
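
A few related set behaviours, as a minimal sketch: `remove` raises KeyError when the item is missing while `discard` does not, and passing a list through `set()` is a simple way to drop duplicates. The flag and counter names are made up for the example.

```python
s = {1, 2, 3, 'OK'}

# remove() raises KeyError for a missing item; discard() silently does nothing.
s.discard(99)                 # no error, s is unchanged
try:
    s.remove(99)
    removed_missing = True
except KeyError:
    removed_missing = False

# Passing a list through set() drops duplicates (and the original order).
unique_count = len(set([3, 5, 3, 5, 7, 4, 2, 8, 3, 2, 4]))   # 6

# Membership tests work on sets as well.
has_ok = 'OK' in s            # True
```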

Dictionaries

In [40]:
a = {(1,2,'something'):'Amazing', 'id#': 342342432, 'id#':3}
In [41]:
a
Out[41]:
{'id#': 3, (1, 2, 'something'): 'Amazing'}
In [42]:
a['id#']
Out[42]:
3
In [43]:
a[(1,2, 'something')]
Out[43]:
'Amazing'
In [44]:
a[0]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-44-5ccf417d7af1> in <module>()
----> 1 a[0]

KeyError: 0
In [45]:
'id#' in a
Out[45]:
True
In [46]:
if 'id#' in a:
    print 'Dups!!'
else:
    a['id#'] = 'Something else'
a
Dups!!
Out[46]:
{'id#': 3, (1, 2, 'something'): 'Amazing'}
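
The cells above can be summarised in a small sketch: a key repeated in the literal keeps only its last value, `get` avoids the KeyError for a missing key, and assignment either overwrites or adds. The 'status' key is an invented example.

```python
a = {(1, 2, 'something'): 'Amazing', 'id#': 342342432, 'id#': 3}

# A repeated key in the literal keeps only the last value.
size = len(a)                        # 2
last_wins = a['id#']                 # 3

# get() returns a default instead of raising KeyError for a missing key.
missing = a.get(0, 'not found')      # 'not found'

# Assigning to an existing key overwrites; assigning to a new key adds it.
a['id#'] = 4
a['status'] = 'Active'
```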

Iteration

In [ ]:
# general pattern: coll can be any collection (list, string, tuple, set, dict)
for ele in coll:
    pass  # do something with ele
In [48]:
for x in 'Python':
    print x
P
y
t
h
o
n
In [49]:
for i,l in enumerate(langs, start=1):
    print i,l + '!'
1 Python!
2 Scala!
3 Java!
4 C++!
In [50]:
for i in range(len(langs)):
    print i, langs[i] + '!'
0 Python!
1 Scala!
2 Java!
3 C++!
In [51]:
for i in a:
    print a[i]
Amazing
3
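
Looping over a dict yields its keys, as the cell above shows. A minimal sketch of the related forms follows: `items()` yields (key, value) pairs, and a set can be iterated like any other collection. Results are gathered into lists here instead of printed, so the snippet runs the same on Python 2 and 3.

```python
a = {'id#': 3, (1, 2, 'something'): 'Amazing'}

# Iterating over a dict visits its keys.
keys_seen = []
for key in a:
    keys_seen.append(key)

# items() yields (key, value) pairs that can be unpacked in the loop header.
pairs_seen = []
for key, value in a.items():
    pairs_seen.append((key, value))

# Sets are iterable too; the visiting order is not guaranteed.
total = 0
for n in {1, 2, 3}:
    total += n                # ends at 6 regardless of order
```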
In [52]:
l = ['c++', 'Java', 'Python', 'Scala']
sorted(l)
Out[52]:
['Java', 'Python', 'Scala', 'c++']
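
The lowercase 'c++' lands last because string comparison is case-sensitive and uppercase letters order before lowercase ones. One possible way to ignore case, sketched here with an illustrative list name, is to pass `str.lower` as the key.

```python
names = ['c++', 'Java', 'Python', 'Scala']

# Default: uppercase letters sort before lowercase ones.
case_sensitive = sorted(names)                    # ['Java', 'Python', 'Scala', 'c++']

# key=str.lower compares lowercased copies; the items themselves are unchanged.
ignore_case = sorted(names, key=str.lower)        # ['c++', 'Java', 'Python', 'Scala']
```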