OOP in Python

Object     Class
5          int
"hi"       str
np.mean    function
import numpy as np
a = np.array([1,2,3,4])
# shape attribute
a.shape

# min method
a.min()
1
dir(a)
['T',
 '__abs__',
 '__add__',
 '__and__',
 '__array__',
 '__array_finalize__',
 '__array_function__',
 '__array_interface__',
 '__array_prepare__',
 '__array_priority__',
 '__array_struct__',
 '__array_ufunc__',
 '__array_wrap__',
 '__bool__',
 '__class__',
 '__class_getitem__',
 '__complex__',
 '__contains__',
 '__copy__',
 '__deepcopy__',
 '__delattr__',
 '__delitem__',
 '__dir__',
 '__divmod__',
 '__dlpack__',
 '__dlpack_device__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__gt__',
 '__hash__',
 '__iadd__',
 '__iand__',
 '__ifloordiv__',
 '__ilshift__',
 '__imatmul__',
 '__imod__',
 '__imul__',
 '__index__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__ior__',
 '__ipow__',
 '__irshift__',
 '__isub__',
 '__iter__',
 '__itruediv__',
 '__ixor__',
 '__le__',
 '__len__',
 '__lshift__',
 '__lt__',
 '__matmul__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__or__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rlshift__',
 '__rmatmul__',
 '__rmod__',
 '__rmul__',
 '__ror__',
 '__rpow__',
 '__rrshift__',
 '__rshift__',
 '__rsub__',
 '__rtruediv__',
 '__rxor__',
 '__setattr__',
 '__setitem__',
 '__setstate__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__xor__',
 'all',
 'any',
 'argmax',
 'argmin',
 'argpartition',
 'argsort',
 'astype',
 'base',
 'byteswap',
 'choose',
 'clip',
 'compress',
 'conj',
 'conjugate',
 'copy',
 'ctypes',
 'cumprod',
 'cumsum',
 'data',
 'diagonal',
 'dot',
 'dtype',
 'dump',
 'dumps',
 'fill',
 'flags',
 'flat',
 'flatten',
 'getfield',
 'imag',
 'item',
 'itemset',
 'itemsize',
 'max',
 'mean',
 'min',
 'nbytes',
 'ndim',
 'newbyteorder',
 'nonzero',
 'partition',
 'prod',
 'ptp',
 'put',
 'ravel',
 'real',
 'repeat',
 'reshape',
 'resize',
 'round',
 'searchsorted',
 'setfield',
 'setflags',
 'shape',
 'size',
 'sort',
 'squeeze',
 'std',
 'strides',
 'sum',
 'swapaxes',
 'take',
 'tobytes',
 'tofile',
 'tolist',
 'tostring',
 'trace',
 'transpose',
 'var',
 'view']
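
The listing above mixes data attributes and methods. One rough way to tell them apart is to check whether each name is callable. The sketch below uses a built-in complex number rather than a NumPy array so that it runs without third-party packages; the variable names are illustrative only.

```python
# A complex number has data attributes (real, imag) and a method (conjugate)
z = 3 + 4j

# Keep only the public names, i.e. those without a leading underscore
public_names = [name for name in dir(z) if not name.startswith("_")]

# Methods are callable; plain data attributes are not
methods = [name for name in public_names if callable(getattr(z, name))]
attributes = [name for name in public_names if not callable(getattr(z, name))]

print(type(z).__name__)   # name of the object's class
print(methods)
print(attributes)
```

The same filter can be applied to dir(a) for the array above, although properties and other descriptors can blur the line between attribute and method.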

example

You can use help() to explore an unfamiliar object. Notice how descriptive names of attributes and methods, together with the methods’ docstrings, helped you figure out class functionality even when you didn’t know how it was implemented. Keep this in mind when writing your own classes.

help(mystery)
Help on Employee in module __main__ object:

class Employee(builtins.object)
 |  Employee(name, email=None, salary=None, rank=5)
 |  
 |  Class representing a company employee.
 |  
 |  Attributes
 |   ----------
 |   name : str 
 |       Employee's name        
 |   email : str, default None
 |       Employee's email
 |   salary : float, default None
 |       Employee's salary
 |   rank : int, default 5
 |       The rank of the employee in the company hierarchy (1 -- CEO, 2 -- direct reports of CEO, 3 -- direct reports of direct reports of CEO etc). Cannot be None if the employee is current.
 |  
 |  Methods defined here:
 |  
 |  __init__(self, name, email=None, salary=None, rank=5)
 |      Create an Employee object
 |  
 |  give_raise(self, amount)
 |      Raise employee's salary by a certain `amount`. Can only be used with current employees.
 |      
 |      Example usage:
 |        # emp is an Employee object
 |        emp.give_raise(1000)
 |  
 |  promote(self)
 |      Promote an employee to the next level of the company hierarchy. Decreases the rank of the employee by 1. Can only be used on current employees who are not at the top of the hierarchy.
 |      
 |      Example usage:
 |          # emp is an Employee object
 |          emp.promote()
 |  
 |  terminate(self)
 |      Terminate the employee. Sets salary and rank to None.
 |      
 |      Example usage:
 |         # emp is an Employee object
 |         emp.terminate()
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
 |  
 |  __weakref__
 |      list of weak references to the object (if defined)
# Print the mystery employee's name
print(mystery.name)

# Print the mystery employee's salary
print(mystery.salary)

# Give the mystery employee a raise of $2500
mystery.give_raise(2500)

# Print the salary again
print(mystery.salary)

Creating a Class

  • To start a new class definition:
class Customer:
    # code for class is indented in this block

You can create an empty class, like a blank template, by using the pass statement as its body:

class Customer:
    pass

Even though it’s empty, we can already create objects of the class by writing the name of the class followed by parentheses:

# two objects of class Customer
c1 = Customer()
c2 = Customer()

We want a class to have attributes (to store data) and methods (to operate on that data):

  • Methods are functions, so defining a method means defining a Python function inside the class
  • The one difference is the special self argument, which every method takes as its first argument

class Customer:
    # method "identify"
    def identify(self, name):
        print("I am Customer " + name)

# Create a customer
cust = Customer()
cust.identify("Laura")
I am Customer Laura
  • Leave out self when calling a method on an object (above, we passed only the customer’s name)

  • Classes are templates: objects of a class don’t exist yet when the class is being defined, but we still need a way to refer to the data of a particular object inside the class

    • self is a stand-in for the future object, so we can use it to access attributes and call other methods from within the class definition even though no objects have been created yet
    • Python takes care of self when a method is called on an object: cust.identify("Laura") is interpreted as Customer.identify(cust, "Laura") (note that the object itself is passed as the first argument)
  • Attributes, like variables, are created by assignment (=) in methods

    • an attribute comes into existence only when a value is assigned to it
class Customer:
    # set the name attribute of an object to new_name
    def set_name(self, new_name):
        # create an attribute by assigning a value
        self.name = new_name # will create .name when set_name is called
cust = Customer() # .name doesn't exist here yet
cust.set_name("lara") # .name is created and set to lara
print(cust.name)
lara
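
To see what Python does with self, the following minimal sketch calls the same method in two ways: on the object, and on the class with the object passed explicitly. It also checks that the attribute does not exist before the first assignment. The helper variable names (had_name_before, first, second) are illustrative.

```python
class Customer:
    def set_name(self, new_name):
        # The attribute is created on whichever object is passed as self
        self.name = new_name


cust = Customer()
had_name_before = hasattr(cust, "name")   # no .name attribute yet

# Calling the method on the object ...
cust.set_name("Lara")
first = cust.name

# ... is equivalent to calling it on the class and passing the object explicitly
Customer.set_name(cust, "Laura")
second = cust.name

print(had_name_before, first, second)
```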

Now, instead of passing name as a parameter (as before), we will use the data already stored in the name attribute to write a better identify method:

class Customer:
    def set_name(self, new_name):
        self.name = new_name 
    
    # Use .name from the object itself
    def identify(self):
        print("I am Customer " + self.name)
class MyCounter:
    def set_count(self, n):
        self.count = n

mc = MyCounter()
mc.set_count(5)

mc.count = mc.count + 1
print(mc.count)
6

Well done! Notice how you used self.count to refer to the count attribute inside the class definition, and mc.count to refer to the count attribute of an object. Make sure you understand the difference and when to use each form.

# Create an empty class Employee
class Employee:
    pass


# Create an object emp of class Employee 
emp = Employee()
# Include a set_name method
class Employee:
  
  def set_name(self, new_name):
    self.name = new_name
  
# Create an object emp of class Employee  
emp = Employee()

# Use set_name() on emp to set the name of emp to 'Korel Rossi'
emp.set_name('Korel Rossi')

# Print the name of emp
print(emp.name)
Korel Rossi
class Employee:
  
  def set_name(self, new_name):
    self.name = new_name
  
  # Add set_salary() method
  def set_salary(self, new_salary):
    self.salary = new_salary
  
# Create an object emp of class Employee  
emp = Employee()

# Use set_name to set the name of emp to 'Korel Rossi'
emp.set_name('Korel Rossi')

# Set the salary of emp to 50000
emp.set_salary(50000)

Fantastic! You created your first class with two methods and two attributes. Try running dir(emp) in the console and see if you can find where these attributes and methods pop up!

dir(emp)
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'name',
 'salary',
 'set_name',
 'set_salary']
class Employee:
    def set_name(self, new_name):
        self.name = new_name

    def set_salary(self, new_salary):
        self.salary = new_salary 
  
emp = Employee()
emp.set_name('Korel Rossi')
emp.set_salary(50000)

# Print the salary attribute of emp
print(emp.salary)

# Increase salary of emp by 1500
emp.salary = emp.salary + 1500

# Print the salary attribute of emp again
print(emp.salary)
50000
51500

Now make a new method

class Employee:
    def set_name(self, new_name):
        self.name = new_name

    def set_salary(self, new_salary):
        self.salary = new_salary 

    # Add a give_raise() method with raise amount as a parameter
    def give_raise(self, amount):
        self.salary = self.salary + amount


emp = Employee()
emp.set_name('Korel Rossi')
emp.set_salary(50000)

print(emp.salary)
emp.give_raise(1500)
print(emp.salary)
50000
51500
class Employee:
    def set_name(self, new_name):
        self.name = new_name

    def set_salary(self, new_salary):
        self.salary = new_salary 

    def give_raise(self, amount):
        self.salary = self.salary + amount

    # Add monthly_salary method that returns 1/12th of salary attribute
    def monthly_salary(self):
        return self.salary / 12
    
emp = Employee()
emp.set_name('Korel Rossi')
emp.set_salary(50000)

# Get monthly salary of emp and assign to mon_sal
mon_sal = emp.monthly_salary()

# Print mon_sal
print(mon_sal)
4166.666666666667

You are doing great! You might be wondering: why did we write these methods when all the same operations could have been performed on object attributes directly? Our code was very simple, but methods that deal only with attribute values often have pre-processing and checks built in: for example, maybe the company has a maximal allowable raise amount. Then it would be prudent to add a clause to the give_raise() method that checks whether the raise amount is within limits.
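
As a minimal sketch of such a check, the version of give_raise() below refuses raises above a limit. The MAX_RAISE constant, its value, and the message text are assumptions made for illustration; they are not part of the original exercise.

```python
MAX_RAISE = 5000  # hypothetical company-wide limit on a single raise


class Employee:
    def set_salary(self, new_salary):
        self.salary = new_salary

    def give_raise(self, amount):
        # Reject raises above the limit instead of applying them
        if amount > MAX_RAISE:
            print("Raise exceeds the allowed maximum!")
        else:
            self.salary = self.salary + amount


emp = Employee()
emp.set_salary(50000)
emp.give_raise(1500)    # within the limit, so it is applied
emp.give_raise(20000)   # above the limit, so the salary is unchanged
print(emp.salary)
```

Changing emp.salary directly would bypass this check, which is one reason to route updates through a method.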

Constructor

A better strategy than the above is to add data to an object when creating it, rather than creating attributes one at a time through methods, just as when creating a NumPy array or a pandas DataFrame.

  • Python supports this with a special method called the constructor, __init__(), which is called automatically each time an object is created

class Customer:
    def __init__(self, name):
        self.name = name       # create .name attribute and set it to the name parameter
        print("The __init__ method was called")


cust = Customer("Lara")         # __init__ is implicitly called
print(cust)
The __init__ method was called
<__main__.Customer object at 0x1033e7790>

Now we can pass the customer’s name in parentheses when creating the customer object, and Python automatically passes it on to __init__().

Create another attribute, balance

class Customer:
    def __init__(self, name, balance):      # <-- balance parameter added
        self.name = name       
        self.balance = balance              # <-- balance attribute added
        print("The __init__ method was called")

cust = Customer("Lara", 1000)               # <-- __init__ is called
print(cust.name)
print(cust.balance)
The __init__ method was called
Lara
1000

The __init__ constructor is also a good place to set default values for attributes, so that a value doesn’t have to be passed for every attribute when an object is created:

class Customer:
    def __init__(self, name, balance = 0):      # <-- default balance
        self.name = name       
        self.balance = balance              
        print("The __init__ method was called")

cust = Customer("Lara")               # <-- don't have to specify balance
print(cust.name)
print(cust.balance)
The __init__ method was called
Lara
0

So attributes can be defined in methods (which are called after the object is created) or in the constructor (better).
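
The difference shows up when an attribute is read before it has been set. In the sketch below, LazyCustomer is a hypothetical class that creates .name only inside set_name(), while Customer sets its attributes in the constructor.

```python
class LazyCustomer:
    def set_name(self, new_name):
        self.name = new_name      # .name exists only after set_name() is called


class Customer:
    def __init__(self, name, balance=0):
        self.name = name          # .name and .balance exist on every object
        self.balance = balance


lazy = LazyCustomer()
try:
    lazy.name
    lazy_failed = False
except AttributeError:
    # Reading an attribute that was never assigned raises AttributeError
    lazy_failed = True

cust = Customer("Lara")
print(lazy_failed, cust.name, cust.balance)
```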

Best practices:

  • Initialize attributes in __init__()
  • Name classes in CamelCase; name functions and attributes in lower_snake_case
  • Keep self as self. The name self is only a convention, so you could write:

class MyClass:
    # this works but isn't recommended
    def my_method(kitty, attr):
        kitty.attr = attr

  • Classes, like functions, allow docstrings, which are displayed when help() is called on the class or on one of its objects:
class MyClass:
    """This class does nothing"""
    pass

In this exercise, you’ll continue working on the Employee class. Instead of using the methods like set_salary() that you wrote in the previous lesson, you will introduce a constructor that assigns name and salary to the employee at the moment when the object is created.

You’ll also create a new attribute – hire_date – which will not be initialized through parameters, but instead will contain the current date.

Initializing attributes in the constructor is a good idea, because this ensures that the object has all the necessary attributes the moment it is created.

class Employee:
    # Create __init__() method
    def __init__(self, name, salary = 0):
        # Create the name and salary attributes
        self.name = name
        self.salary = salary
    
    # From the previous lesson
    def give_raise(self, amount):
        self.salary += amount

    def monthly_salary(self):
        return self.salary/12
        
emp = Employee("Korel Rossi")
print(emp.name)
print(emp.salary)     
Korel Rossi
0

The __init__() method is a great place to do preprocessing.

Modify __init__() to check whether the salary parameter is positive:
    if yes, assign it to the salary attribute,
    if not, assign 0 to the attribute and print "Invalid salary!".
class Employee:
  
    def __init__(self, name, salary=0):
        self.name = name
        # Modify code below to check if salary is positive
        if salary > 0:
            self.salary = salary
        else:
            self.salary = 0
            print("Invalid salary!")
   
    # ...Other methods omitted for brevity ...
      
emp = Employee("Korel Rossi", -1000)
print(emp.name)
print(emp.salary)
Invalid salary!
Korel Rossi
0
Import datetime from the datetime module. This contains the function that returns current date.
Add an attribute hire_date and set it to datetime.today().
# Import datetime from datetime
from datetime import datetime

class Employee:
    
    def __init__(self, name, salary=0):
        self.name = name
        if salary > 0:
            self.salary = salary
        else:
            self.salary = 0
            print("Invalid salary!")

        # Add the hire_date attribute and set it to today's date
        self.hire_date = datetime.today()

    # ...Other methods omitted for brevity ...
      
emp = Employee("Korel Rossi", -1000)
print(emp.name)
print(emp.salary)
Invalid salary!
Korel Rossi
0

You’re doing great! Notice how you had to add the import statement to use the today() function. You can use functions from other modules in your class definition, but you need to import the module first, and by convention the import statement goes at the top of the file, outside the class definition.

Write a class from scratch

You are a Python developer writing a visualization package. For any element in a visualization, you want to be able to tell the position of the element, how far it is from other elements, and easily implement a horizontal or vertical flip.

The most basic element of any visualization is a single point. In this exercise, you’ll write a class for a point on a plane from scratch.

Define the class Point that has:

Two attributes, x and y - the coordinates of the point on the plane;
A constructor that accepts two arguments, x and y, that initialize the corresponding attributes. These arguments should have default value of 0.0;
A method distance_to_origin() that returns the distance from the point to the origin. The formula for that is $\sqrt{x^2 + y^2}$;
A method reflect() that reflects the point with respect to the x- or y-axis:

accepts one argument axis,
if axis="x" , it sets the y (not a typo!) attribute to the negative value of the y attribute,
if axis="y", it sets the x attribute to the negative value of the x attribute,
for any other value of axis, prints an error message. 
from numpy import sqrt

# Write the class Point as outlined in the instructions
class Point:
    def __init__(self, x = 0.0, y = 0.0):
        self.x = x
        self.y = y

    def distance_to_origin(self):
        return sqrt(self.x**2 + self.y**2)
        
    def reflect(self, axis):
        if axis == "x":
            self.y = -1 * self.y
        elif axis == "y":
            self.x = -1 * self.x
        else:
            print("Error!")

test code:

pt = Point(x=3.0)
pt.reflect("y")
print((pt.x, pt.y))
pt.y = 4.0
print(pt.distance_to_origin())
(-3.0, 0.0)
5.0

Great work! Notice how you implemented distance_to_origin() as a method instead of an attribute. Implementing it as an attribute would be less sustainable - you would have to recalculate it every time you change the values of the x and y attributes to make sure the object state stays current.
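
To make the staleness problem concrete, here is a sketch of the alternative design. CachedPoint is a hypothetical class that stores the distance as an attribute computed once in the constructor.

```python
from math import sqrt


class CachedPoint:
    """Hypothetical variant that stores the distance as an attribute."""

    def __init__(self, x=0.0, y=0.0):
        self.x = x
        self.y = y
        # Computed once, at construction time
        self.distance = sqrt(self.x ** 2 + self.y ** 2)


pt = CachedPoint(x=3.0)
pt.y = 4.0                             # the stored distance is not updated

stale = pt.distance                    # still the value for the point (3.0, 0.0)
fresh = sqrt(pt.x ** 2 + pt.y ** 2)    # what a method would compute now
print(stale, fresh)
```

A method recomputes the value on every call, so it cannot drift out of sync with x and y.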

Inheritance and Polymorphism

  • Inheritance: extends functionality of existing code
  • Polymorphism: creates a unified interface
  • Encapsulation: bundling of data and methods

Class Attributes

  • Instance-level data vs. class-level data
    • Consider the employee class we made before:
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary

emp1 = Employee("Teo Mille", 50000)
emp2 = Employee("Marta Popov", 65000)
  • name and salary are instance attributes
    • used self to bind them to a particular instance
  • Data that can be shared among all instances of a class are class-level data, like a global variable:
    • e.g. a minimal salary across all employees
    • define it directly in the class block
class Employee:
    # Define a class attribute
    MIN_SALARY = 30000              #<--- no self

    def __init__(self, name, salary):
        self.name = name
        # Use class name to access class attribute
        if salary >= Employee.MIN_SALARY:
            self.salary = salary
        else:
            self.salary = Employee.MIN_SALARY

Note we don’t use self to define the class attribute. And we use the class_name.attr_name format to access the class attribute.

# all employees have min salary of 30000

emp1 = Employee("TBD", 40000)
print(emp1.MIN_SALARY)

emp2 = Employee("TBD", 60000)
print(emp2.MIN_SALARY)
30000
30000

Why use class attributes? The main use case is global constants related to the class:

  • minimum/maximum values
  • commonly used values, e.g. pi for a Circle
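
A short sketch of the Circle case follows. The class, its area() method, and the rounded value of PI are illustrative assumptions, not code from the course.

```python
class Circle:
    # Class attribute: a constant shared by every Circle (rounded value)
    PI = 3.14159

    def __init__(self, radius):
        self.radius = radius      # instance attribute, differs per circle

    def area(self):
        # Access the constant through the class name
        return Circle.PI * self.radius ** 2


c1, c2 = Circle(1.0), Circle(2.0)
print(c1.PI == c2.PI == Circle.PI)   # every instance sees the same constant
print(c1.area(), c2.area())
```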

Class Methods

  • Regular methods are already shared between the instances of a class: the code is the same for every instance
  • The only difference is the data fed into them
    • We can also define methods bound to the class rather than to an instance, but their application is narrow because they can’t use any instance-level data
  • Define a class method with the @classmethod decorator
class MyClass:

    @classmethod                  #<--- use decorator to declare class method
    def my_method(cls, *args):    #<--- cls argument refers to the class
        # Do stuff here
        pass

The main difference is that the first argument is not self but cls (it refers to the class, just as self refers to a particular instance).

To call a class method, use class.method syntax, rather than object.method syntax

MyClass.my_method(args...)

The main use case is alternative constructors:

  • A class can have only one __init__() method, but there might be multiple ways to initialize an object
    • e.g. we might want to create an Employee object from data stored in a file
    • a regular method can’t do this: it would require an instance, and there isn’t one yet
  • Below we introduce a class method from_file that accepts a file name, reads the first line (presumably the employee’s name), and returns an object instance

class Employee:
    MIN_SALARY = 30000
    def __init__(self, name, salary = 30000):
        self.name = name
        if salary >= Employee.MIN_SALARY:
            self.salary = salary
        else:
            self.salary = Employee.MIN_SALARY
    
    @classmethod
    def from_file(cls, filename):
        with open(filename, "r") as f:
            name = f.readline().strip()   # drop the trailing newline
        return cls(name)

In the return statement, we use the cls variable, which refers to the class: calling cls(name) runs the __init__() constructor, just as calling Employee(name) would.

Call the method from_file using class.method syntax:

# Create an employee without calling Employee()
emp = Employee.from_file("employee_data.txt")
type(emp)

# type is:
# __main__.Employee
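
The call above needs a file on disk. The following self-contained sketch writes a throwaway one-line file with the standard tempfile module and then builds an Employee from it. The file contents and the use of max() in the constructor are assumptions made for brevity. Because readline() keeps the trailing newline, the sketch strips it.

```python
import os
import tempfile


class Employee:
    MIN_SALARY = 30000

    def __init__(self, name, salary=30000):
        self.name = name
        # Same effect as the if/else check: never go below MIN_SALARY
        self.salary = max(salary, Employee.MIN_SALARY)

    @classmethod
    def from_file(cls, filename):
        with open(filename, "r") as f:
            # readline() keeps the trailing newline, so strip it
            name = f.readline().strip()
        return cls(name)


# Write a temporary one-line file so the example does not depend on real data
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Korel Rossi\n")
    path = tmp.name

emp = Employee.from_file(path)
os.remove(path)                     # clean up the temporary file

print(type(emp).__name__, repr(emp.name), emp.salary)
```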

example

In this exercise, you will be a game developer working on a game that will have several players moving on a grid and interacting with each other. As the first step, you want to define a Player class that will just move along a straight line. Player will have a position attribute and a move() method. The grid is limited, so the position of Player will have a maximal value.

Define a class Player that has:
A class attribute MAX_POSITION with value 10.
The __init__() method that sets the position instance attribute to 0.
Print Player.MAX_POSITION.
Create a Player object p and print its MAX_POSITION.
# Create a Player class
class Player:
    MAX_POSITION = 10
    def __init__(self, position = 0):
        self.position = position


# Print Player.MAX_POSITION       
print(Player.MAX_POSITION)

# Create a player p and print its MAX_POSITION
p = Player()
print(p.MAX_POSITION)
10
10

Add a move() method with a steps parameter such that:

if position plus steps is less than MAX_POSITION, then add steps to position and assign the result back to position;
otherwise, set position to MAX_POSITION.
class Player:
    MAX_POSITION = 10
    MAX_SPEED = 3
    def __init__(self):
        self.position = 0

    # Add a move() method with steps parameter
    def move(self, steps):
        if self.position + steps < Player.MAX_POSITION:
            self.position = self.position + steps
        else:
            self.position = Player.MAX_POSITION
    
       
    # This method provides a rudimentary visualization in the console    
    def draw(self):
        drawing = "-" * self.position + "|" +"-"*(Player.MAX_POSITION - self.position)
        print(drawing)

p = Player(); p.draw()
p.move(4); p.draw()
p.move(5); p.draw()
p.move(3); p.draw()
|----------
----|------
---------|-
----------|

You learned how to define class attributes and how to access them from class instances. So what will happen if you try to assign another value to a class attribute when accessing it from an instance? The answer is not as simple as you might think!

The Player class from the previous exercise is pre-defined. Recall that it has a position instance attribute, and MAX_SPEED and MAX_POSITION class attributes. The initial value of MAX_SPEED is 3.

Create two Player objects p1 and p2.
Print p1.MAX_SPEED and p2.MAX_SPEED.
Assign 7 to p1.MAX_SPEED.
Print p1.MAX_SPEED and p2.MAX_SPEED again.
Print Player.MAX_SPEED.
Examine the output carefully.
# Create Players p1 and p2
p1 = Player()
p2 = Player()

print("MAX_SPEED of p1 and p2 before assignment:")
# Print p1.MAX_SPEED and p2.MAX_SPEED
print(p1.MAX_SPEED)
print(p2.MAX_SPEED)

# Assign 7 to p1.MAX_SPEED
p1.MAX_SPEED = 7

print("MAX_SPEED of p1 and p2 after assignment:")
# Print p1.MAX_SPEED and p2.MAX_SPEED
print(p1.MAX_SPEED)
print(p2.MAX_SPEED)

print("MAX_SPEED of Player:")
# Print Player.MAX_SPEED
print(Player.MAX_SPEED)
MAX_SPEED of p1 and p2 before assignment:
3
3
MAX_SPEED of p1 and p2 after assignment:
7
3
MAX_SPEED of Player:
3

Even though MAX_SPEED is shared across instances, assigning 7 to p1.MAX_SPEED didn’t change the value of MAX_SPEED in p2, or in the Player class.

So what happened? In fact, Python created a new instance attribute in p1, also called it MAX_SPEED, and assigned 7 to it, without touching the class attribute.

Now let’s change the class attribute value for real.

Modify the assignment to assign 7 to Player.MAX_SPEED instead.
# Create Players p1 and p2
p1, p2 = Player(), Player()

print("MAX_SPEED of p1 and p2 before assignment:")
# Print p1.MAX_SPEED and p2.MAX_SPEED
print(p1.MAX_SPEED)
print(p2.MAX_SPEED)

# ---MODIFY THIS LINE--- 
Player.MAX_SPEED = 7

print("MAX_SPEED of p1 and p2 after assignment:")
# Print p1.MAX_SPEED and p2.MAX_SPEED
print(p1.MAX_SPEED)
print(p2.MAX_SPEED)

print("MAX_SPEED of Player:")
# Print Player.MAX_SPEED
print(Player.MAX_SPEED)
MAX_SPEED of p1 and p2 before assignment:
3
3
MAX_SPEED of p1 and p2 after assignment:
7
7
MAX_SPEED of Player:
7

Not obvious, right? But it makes sense, when you think about it! You shouldn’t be able to change the data in all the instances of the class through a single instance. Imagine if you could change the time on all the computers in the world by changing the time on your own computer! If you want to change the value of the class attribute at runtime, you need to do it by referring to the class name, not through an instance.
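
One way to see where each value lives is to inspect the instance dictionaries with the built-in vars(). The sketch below uses a trimmed-down Player class; the value 5 assigned to the class attribute is arbitrary.

```python
class Player:
    MAX_SPEED = 3

    def __init__(self):
        self.position = 0


p1, p2 = Player(), Player()

p1.MAX_SPEED = 7              # creates an instance attribute on p1 only
p1_attrs = dict(vars(p1))     # snapshot of p1's own attributes
p2_attrs = dict(vars(p2))     # p2 has no MAX_SPEED of its own

Player.MAX_SPEED = 5          # changes the class attribute itself
after_class_change = (p1.MAX_SPEED, p2.MAX_SPEED, Player.MAX_SPEED)

del p1.MAX_SPEED              # remove the shadowing instance attribute
after_delete = p1.MAX_SPEED   # lookup falls back to the class attribute

print(p1_attrs, p2_attrs)
print(after_class_change, after_delete)
```

Attribute lookup checks the instance first and the class second, which is why p1 keeps reporting its own value until the instance attribute is deleted.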

Alternative constructors

Python allows you to define class methods as well, using the @classmethod decorator and a special first argument cls. The main use of class methods is defining methods that return an instance of the class, but aren’t using the same code as __init__().

For example, you are developing a time series package and want to define your own class for working with dates, BetterDate. The attributes of the class will be year, month, and day. You want to have a constructor that creates BetterDate objects given the values for year, month, and day, but you also want to be able to create BetterDate objects from strings like 2020-04-30.

You might find the following functions useful:

.split("-") method will split a string at "-" into a list, e.g. "2020-04-30".split("-") returns ["2020", "04", "30"],
int() will convert a string into a number, e.g. int("2019") is 2019.

Add a class method from_str() that:

accepts a string datestr of the format 'YYYY-MM-DD',
splits datestr and converts each part into an integer,
returns an instance of the class with the attributes set to the values extracted from datestr.
class BetterDate:    
    # Constructor
    def __init__(self, year, month, day):
      # Recall that Python allows multiple variable assignments in one line
      self.year, self.month, self.day = year, month, day
    
    # Define a class method from_str
    @classmethod
    def from_str(cls, datestr):
        # Split the string at "-" and convert each part to integer
        parts = datestr.split("-")
        year, month, day = int(parts[0]), int(parts[1]), int(parts[2])
        # Return the class instance
        return cls(year, month, day)
        
bd = BetterDate.from_str('2020-04-30')   
print(bd.year)
print(bd.month)
print(bd.day)
2020
4
30
# import datetime from datetime
from datetime import datetime

class BetterDate:
    def __init__(self, year, month, day):
      self.year, self.month, self.day = year, month, day
      
    @classmethod
    def from_str(cls, datestr):
        year, month, day = map(int, datestr.split("-"))
        return cls(year, month, day)
      
    # Define a class method from_datetime accepting a datetime object
    @classmethod
    def from_datetime(cls, dateobj):
      year, month, day = dateobj.year, dateobj.month, dateobj.day
      return cls(year, month, day) 


# You should be able to run the code below with no errors: 
today = datetime.today()     
bd = BetterDate.from_datetime(today)   
print(bd.year)
print(bd.month)
print(bd.day)
2024
7
19

Great work on those class methods! There’s another type of method that is not bound to a class instance: static methods, defined with the decorator @staticmethod. They are mainly used for helper or utility functions that could as well live outside of the class, but make more sense when bundled into the class. Static methods are beyond the scope of this class, but you can read about them in the Python documentation for @staticmethod.
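
For orientation only, here is a minimal sketch of a static method. The is_leap_year() helper is a hypothetical addition to BetterDate, chosen because it needs neither an instance nor the class.

```python
class BetterDate:
    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day

    @staticmethod
    def is_leap_year(year):
        # No self or cls: the helper uses only its argument
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# A static method can be called on the class or on an instance
bd = BetterDate(2020, 4, 30)
print(BetterDate.is_leap_year(2020))
print(bd.is_leap_year(bd.year))
```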

Class Inheritance

OOP is fundamentally about code reuse

Class inheritance is a mechanism by which we can define a new class that gets all the functionality of another class plus something extra

For example, a BankAccount class might have:

  • a balance attribute
  • a withdraw() method

We could then create a SavingsAccount that also has:

  • an interest_rate attribute
  • a compute_interest() method
  • (and still has balance and withdraw())

To declare a class that inherits from another class:

  • MyParent: the class to inherit from
  • MyChild: the class that inherits everything from MyParent and adds more functionality

class MyChild(MyParent):
    # Do stuff here
    pass

class BankAccount:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount

# Empty class inherited from BankAccount
class SavingsAccount(BankAccount):
    pass          #<-- seems empty but contains entire BankAccount class
# Constructor inherited from BankAccount
savings_acct = SavingsAccount(1000)
type(savings_acct)

# Attribute inherited from BankAccount
savings_acct.balance

# Method inherited from BankAccount
savings_acct.withdraw(300)

Inheritance is an “is-a” relationship: a SavingsAccount is a BankAccount (possibly with special features).

isinstance(savings_acct, SavingsAccount)
isinstance(savings_acct, BankAccount)

# Python treats this as an instance of both classes!
True
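
The relationship runs in one direction only. A minimal sketch, using the built-ins isinstance() and issubclass() on stripped-down versions of the two classes:

```python
class BankAccount:
    def __init__(self, balance):
        self.balance = balance


class SavingsAccount(BankAccount):
    pass


savings_acct = SavingsAccount(1000)
bank_acct = BankAccount(500)

# A SavingsAccount is a BankAccount ...
child_is_parent = isinstance(savings_acct, BankAccount)
# ... but a plain BankAccount is not a SavingsAccount
parent_is_child = isinstance(bank_acct, SavingsAccount)
# issubclass() checks the same relationship between the classes themselves
subclass_check = issubclass(SavingsAccount, BankAccount)

print(child_is_parent, parent_is_child, subclass_check)
```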

examples

Create a subclass

The purpose of child classes – or sub-classes, as they are usually called - is to customize and extend functionality of the parent class.

Recall the Employee class from earlier in the course. In most organizations, managers enjoy more privileges and more responsibilities than a regular employee. So it would make sense to introduce a Manager class that has more functionality than Employee.

But a Manager is still an employee, so the Manager class should be inherited from the Employee class.

class Employee:
  MIN_SALARY = 30000    

  def __init__(self, name, salary=MIN_SALARY):
      self.name = name
      if salary >= Employee.MIN_SALARY:
        self.salary = salary
      else:
        self.salary = Employee.MIN_SALARY
        
  def give_raise(self, amount):
      self.salary += amount      
        
# Define a new class Manager inheriting from Employee
class Manager(Employee):
  pass

# Define a Manager object
mng = Manager("Debbie Lashko", salary = 86500)

# Print mng's name
print(mng.name)
Debbie Lashko
Remove the pass statement and add a display() method to the Manager class that just prints the string "Manager" followed by the full name, e.g. "Manager Katie Flatcher".
Call the .display() method from the mng instance.
class Employee:
  MIN_SALARY = 30000    

  def __init__(self, name, salary=MIN_SALARY):
      self.name = name
      if salary >= Employee.MIN_SALARY:
        self.salary = salary
      else:
        self.salary = Employee.MIN_SALARY
  def give_raise(self, amount):
    self.salary += amount      
        
# MODIFY Manager class and add a display method
class Manager(Employee):
  def display(self):
    print("Manager", self.name)

mng = Manager("Debbie Lashko", 86500)
print(mng.name)

# Call mng.display()
mng.display()
Debbie Lashko
Manager Debbie Lashko

Excellent! You already started customizing! The Manager class now includes functionality that wasn’t present in the original class (the display() function) in addition to all the functionality of the Employee class. Notice that there wasn’t anything special about adding this new method.

Adding Functionality to Subclasses

  • Can run constructor of the parent first by ParentClass.__init__(self, args...)
    • self is an instance of both the child and the parent class
    • not required to call parent constructors in the subclass
class SavingsAccount(BankAccount):

    # Constructor specifically for SavingsAccount with additional param
    def __init__(self, balance, interest_rate):
        # Call parent constructor using classname.__init__()
        BankAccount.__init__(self, balance) #<--- self is a SavingsAccount but also a BankAccount
        # Add functionality
        self.interest_rate = interest_rate
# Construct object with new constructor
acct = SavingsAccount(1000, 0.03)
acct.interest_rate
0.03
  • Can use data from both parent and child class
class SavingsAccount(BankAccount):

    def __init__(self, balance, interest_rate):
        BankAccount.__init__(self, balance)
        self.interest_rate = interest_rate
    
    # New function
    def compute_interest(self, n_periods = 1):
        return self.balance * ( (1 + self.interest_rate) ** n_periods - 1)
  • Note that compute_interest() uses the balance attribute from the parent class BankAccount together with the interest_rate attribute, which exists only in the child class
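
The two snippets above rely on a BankAccount class that these notes do not show. As a minimal, self-contained sketch (the BankAccount body below is an assumption made only for illustration), the pattern of running the parent constructor first and then mixing parent and child data might look like this:

```python
# Hypothetical minimal parent class; the real BankAccount is not shown in these notes
class BankAccount:
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount


class SavingsAccount(BankAccount):
    def __init__(self, balance, interest_rate):
        # Run the parent constructor first so that self.balance is set
        BankAccount.__init__(self, balance)
        self.interest_rate = interest_rate

    def compute_interest(self, n_periods=1):
        # Uses parent data (balance) together with child data (interest_rate)
        return self.balance * ((1 + self.interest_rate) ** n_periods - 1)


acct = SavingsAccount(1000, 0.03)
# The same object is an instance of both classes
print(isinstance(acct, SavingsAccount), isinstance(acct, BankAccount))

interest = acct.compute_interest(n_periods=2)
print(round(interest, 2))

acct.withdraw(100)  # inherited unchanged from BankAccount
print(acct.balance)
```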

Customizing functionality for subclasses

Consider a CheckingAccount subclass that reuses BankAccount’s withdraw() method but customizes it by adding a fee parameter

class CheckingAccount(BankAccount):
    def __init__(self, balance, limit):
        BankAccount.__init__(self, balance)
        self.limit = limit
    
    def deposit(self, amount):
        self.balance += amount
    
    def withdraw(self, amount, fee = 0):   # add a fee
        if fee <= self.limit:
            BankAccount.withdraw(self, amount + fee)
        else:
            BankAccount.withdraw(self, amount + self.limit)
check_acct = CheckingAccount(1000, 25)

# Will call withdraw from CheckingAccount
check_acct.withdraw(200) 
check_acct.withdraw(200, fee =15)  #<-- can use two parameters (fee)
bank_acct = BankAccount(1000)

# Will call withdraw from BankAccount
bank_acct.withdraw(200) # works, but BankAccount.withdraw() has no fee parameter: passing fee=... here would raise a TypeError

The call is the same (withdraw()), but the actual method used is determined by the instance class (CheckingAccount or BankAccount) - an application of polymorphism.
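
A self-contained sketch of this dispatch is given below. The BankAccount body is an assumption (the notes do not show it), and min(fee, self.limit) is just a compact way of writing the if/else from the CheckingAccount above.

```python
class BankAccount:  # hypothetical minimal version
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount


class CheckingAccount(BankAccount):
    def __init__(self, balance, limit):
        BankAccount.__init__(self, balance)
        self.limit = limit

    def withdraw(self, amount, fee=0):
        # Charge the fee, capped at self.limit, on top of the amount
        BankAccount.withdraw(self, amount + min(fee, self.limit))


accounts = [BankAccount(1000), CheckingAccount(1000, 25)]

# Same call on every object; each class supplies its own withdraw()
for acct in accounts:
    acct.withdraw(200)
balances = [acct.balance for acct in accounts]
print(balances)

# Only the child class understands the extra parameter
accounts[1].withdraw(200, fee=15)
print(accounts[1].balance)

try:
    accounts[0].withdraw(200, fee=15)
    fee_error = None
except TypeError as err:
    fee_error = err
print(type(fee_error).__name__)
```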

examples

Inheritance is powerful because it allows us to reuse and customize code without rewriting existing code. By calling methods of the parent class within the child class, we reuse all the code in those methods, making our code concise and manageable.

In this exercise, you’ll continue working with the Manager class that is inherited from the Employee class. You’ll add new data to the class, and customize the give_raise() method from Chapter 1 to increase the manager’s raise amount by a bonus percentage whenever they are given a raise.

A simplified version of the Employee class, as well as the beginning of the Manager class from the previous lesson is provided for you in the script pane.

class Employee:
    def __init__(self, name, salary=30000):
        self.name = name
        self.salary = salary

    def give_raise(self, amount):
        self.salary += amount

        
class Manager(Employee):
  # Add a constructor 
    def __init__(self, name, salary = 50000, project = None):

        # Call the parent's constructor   
        Employee.__init__(self, name, salary)

        # Assign project attribute
        self.project = project  

  
    def display(self):
        print("Manager ", self.name)
class Employee:
    def __init__(self, name, salary=30000):
        self.name = name
        self.salary = salary

    def give_raise(self, amount):
        self.salary += amount

        
class Manager(Employee):
    def display(self):
        print("Manager ", self.name)

    def __init__(self, name, salary=50000, project=None):
        Employee.__init__(self, name, salary)
        self.project = project

    # Add a give_raise method
    def give_raise(self, amount, bonus=1.05):
        new_amount = amount * bonus
        Employee.give_raise(self, new_amount)
    
    
mngr = Manager("Ashta Dunbar", 78500)
mngr.give_raise(1000)
print(mngr.salary)
mngr.give_raise(2000, bonus=1.03)
print(mngr.salary)
79550.0
81610.0

Good work! In the new class, the use of default values ensured that the signature of the customized method was compatible with its signature in the parent class. But what if we defined Manager’s give_raise() to have 2 non-optional parameters? What would be the result of mngr.give_raise(1000)? Experiment in the console and see if you can understand what’s happening. Adding print statements to both give_raise() methods could help!
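
One way to run that experiment is sketched below. StrictManager is a hypothetical name used here so that the original Manager class stays untouched; its give_raise() has no default for bonus.

```python
class Employee:
    def __init__(self, name, salary=30000):
        self.name = name
        self.salary = salary

    def give_raise(self, amount):
        self.salary += amount


class StrictManager(Employee):  # hypothetical variant of Manager
    def give_raise(self, amount, bonus):  # bonus is now required
        Employee.give_raise(self, amount * bonus)


mngr = StrictManager("Ashta Dunbar", 78500)

try:
    # A call that is valid for Employee is no longer valid for the subclass
    mngr.give_raise(1000)
    outcome = "raise applied"
except TypeError as err:
    outcome = "TypeError: {}".format(err)

print(outcome)
print(mngr.salary)  # unchanged, because the call never reached the method body
```

The call fails before any code in give_raise() runs, so the salary stays at its initial value.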

Inheritance of class attributes

In the beginning of this chapter, you learned about class attributes and methods that are shared among all the instances of a class. How do they work with inheritance?

In this exercise, you’ll create subclasses of the Player class from the first lesson of the chapter, and explore the inheritance of class attributes and methods.

The Player class has been defined for you. Recall that the Player class had two class-level attributes: MAX_POSITION and MAX_SPEED, with default values 10 and 3.

# Create a Racer class and set MAX_SPEED to 5
class Racer(Player):
    MAX_SPEED = 5
 
# Create a Player and a Racer objects
p = Player()
r = Racer()

print("p.MAX_SPEED = ", p.MAX_SPEED)
print("r.MAX_SPEED = ", r.MAX_SPEED)

print("p.MAX_POSITION = ", p.MAX_POSITION)
print("r.MAX_POSITION = ", r.MAX_POSITION)
p.MAX_SPEED =  7
r.MAX_SPEED =  5
p.MAX_POSITION =  10
r.MAX_POSITION =  10

Class attributes CAN be inherited, and the value of class attributes CAN be overwritten in the child class

Correct! But notice that the value of MAX_SPEED in Player was not affected by the changes to the attribute of the same name in Racer.
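
The lookup rule behind this can be sketched as follows. The Player class here is a hypothetical stand-in with the default values quoted above (the real class is not shown in these notes).

```python
class Player:  # hypothetical stand-in for the course's Player class
    MAX_POSITION = 10
    MAX_SPEED = 3


class Racer(Player):
    MAX_SPEED = 5  # overrides the value for Racer only


# Racer stores its own MAX_SPEED but borrows MAX_POSITION from Player
print("MAX_SPEED" in Racer.__dict__)
print("MAX_POSITION" in Racer.__dict__)

# Changing the parent's value is visible through the child
# only for attributes the child did not override
Player.MAX_POSITION = 20
Player.MAX_SPEED = 7
print(Racer.MAX_POSITION)
print(Racer.MAX_SPEED)
print(Player().MAX_SPEED, Racer().MAX_SPEED)
```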

Customizing a DataFrame

In your company, any data has to come with a timestamp recording when the dataset was created, to make sure that outdated information is not being used. You would like to use pandas DataFrames for processing data, but you would need to customize the class to allow for the use of timestamps.

In this exercise, you will implement a small LoggedDF class that inherits from a regular pandas DataFrame but has a created_at attribute storing the timestamp. You will then augment the standard to_csv() method to always include a column storing the creation date.

Tip: all DataFrame methods have many parameters, and it is not sustainable to copy all of them for each method you’re customizing. The trick is to use variable-length arguments *args and **kwargs to catch all of them.

# Import pandas as pd, plus the datetime class used in the constructor
import pandas as pd
from datetime import datetime

# Define LoggedDF inherited from pd.DataFrame and add the constructor
class LoggedDF(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        pd.DataFrame.__init__(self, *args, **kwargs)
        self.created_at = datetime.today()
    
ldf = LoggedDF({"col1": [1,2], "col2": [3,4]})
print(ldf.values)
print(ldf.created_at)
[[1 3]
 [2 4]]
2024-07-19 12:46:41.881625
# Import pandas as pd, plus the datetime class used in the constructor
import pandas as pd
from datetime import datetime

# Define LoggedDF inherited from pd.DataFrame and add the constructor
class LoggedDF(pd.DataFrame):
  
  def __init__(self, *args, **kwargs):
    pd.DataFrame.__init__(self, *args, **kwargs)
    self.created_at = datetime.today()
    
  def to_csv(self, *args, **kwargs):
    # Copy self to a temporary DataFrame
    temp = self.copy()
    
    # Create a new column filled with self.created_at
    temp["created_at"] = self.created_at
    
    # Call pd.DataFrame.to_csv on temp, passing in *args and **kwargs
    pd.DataFrame.to_csv(temp, *args, **kwargs)

Incredible work! Using *args and **kwargs allows you to not worry about keeping the signature of your customized method compatible. Notice how in the very last line, you called the parent method and passed an object to it that isn’t self. When you call parent methods in the class, they should accept some object as the first argument, and that object is usually self, but it doesn’t have to be!
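
The same two ideas (forwarding *args/**kwargs, and calling a parent method on an object other than self) can be tried without pandas. The sketch below uses a hypothetical LoggedDict built on the built-in dict; it is an analogy to LoggedDF, not part of the exercise.

```python
from datetime import datetime


class LoggedDict(dict):  # hypothetical dict-based analogue of LoggedDF
    def __init__(self, *args, **kwargs):
        # Forward whatever the caller passed to the parent constructor
        dict.__init__(self, *args, **kwargs)
        self.created_at = datetime.today()

    def get(self, *args, **kwargs):
        # Copy self into a temporary plain dict and add the timestamp
        temp = dict(self)
        temp["created_at"] = self.created_at
        # Call the parent method on temp, not on self
        return dict.get(temp, *args, **kwargs)


ld = LoggedDict(col1=1, col2=2)
print(ld.get("col1"))
print(ld.get("missing", 0))
print(ld.get("created_at") == ld.created_at)
print("created_at" in ld)  # the original object was not modified
```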

Integrating with Python

Operator Overloading

Object equality

class Customer:
    def __init__(self, name, balance, id):
        self.name, self.balance = name, balance
        self.id = id

customer1 = Customer("Maryam Azar", 3000, 123)
customer2 = Customer("Maryam Azar", 3000, 123)
customer1 == customer2
False

By default, Python doesn’t consider two objects with the same data equal. This has to do with how Python stores objects and the variables that refer to them.

print(customer1)

print(customer2)
<__main__.Customer object at 0x1033e7bb0>
<__main__.Customer object at 0x1033e7ac0>

Printing the objects shows “Customer object at” followed by a memory address written as a hexadecimal number (note that the two addresses are different)

  • Behind the scenes, when an object is created, Python allocates a chunk of memory to it, and the variable the object is assigned to actually just contains a reference to that memory chunk.
  • Above, we’ve allocated two chunks of memory to two objects labeled customer1 and customer2
  • When we compare customer1 == customer2, we are actually just comparing the references, not the underlying data, and these references point to different locations in memory, thus, not equal
import numpy as np

# Two different arrays containing the same data
array1 = np.array([1,2,3])
array2 = np.array([1,2,3])

array1 == array2
array([ True,  True,  True])

NumPy arrays, however, are compared using their data! Same with Pandas DataFrames, etc.

We can define a special method for our classes to ensure equality by their data:

  • __eq__() method is implicitly called whenever two objects of the same class are compared (using ==)
    • can redefine this method to use custom comparison code
    • two objects to be compared: self, other, should return Boolean
class Customer:
    def __init__(self, id, name):
        self.id, self.name = id, name
    # Will be called when == is used
    def __eq__(self, other):
        # Diagnostic printout
        print("__eq__() is called")

        # Returns True if all attributes match
        return (self.id == other.id) and \
            (self.name == other.name)
customer1 = Customer(123, "Maryam Azar")
customer2 = Customer(123, "Maryam Azar")
customer1 == customer2
__eq__() is called
True

Python allows you to implement all comparison operators in custom class using special methods:

Special methods
Operator Method
== __eq__()
!= __ne__()
>= __ge__()
<= __le__()
> __gt__()
< __lt__()
  • __hash__() method allows objects to be used as dict keys and in sets
    • briefly, assigns integer to an object, such that equal objects have equal hashes, and object hash does not change through the object’s lifetime
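
A short sketch of how these special methods work together is given below. The class and attribute names follow the Customer example above, and Unhashable is a hypothetical helper. One point worth knowing: a class that defines __eq__() without __hash__() becomes unhashable, so the two are usually defined together.

```python
class Customer:
    def __init__(self, id, name):
        self.id, self.name = id, name

    def __eq__(self, other):
        return (self.id, self.name) == (other.id, other.name)

    def __hash__(self):
        # Equal objects must have equal hashes, so hash the data __eq__ compares
        return hash((self.id, self.name))

    def __lt__(self, other):
        # Enables < and therefore sorted()
        return self.id < other.id


class Unhashable:  # hypothetical: __eq__ without __hash__
    def __eq__(self, other):
        return True


a = Customer(123, "Maryam Azar")
b = Customer(123, "Maryam Azar")
c = Customer(45, "Larry Tores")

print(a == b)
print(len({a, b, c}))  # a and b collapse into one set element
print([cust.id for cust in sorted([a, c])])

try:
    hash(Unhashable())
    hash_failed = False
except TypeError:
    hash_failed = True
print(hash_failed)
```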

examples

Overloading equality

When comparing two objects of a custom class using ==, Python by default compares just the object references, not the data contained in the objects. To override this behavior, the class can implement the special __eq__() method, which accepts two arguments (the objects to be compared) and returns True or False. This method will be implicitly called when two objects are compared.

The BankAccount class from the previous chapter is available for you in the script pane. It has one attribute, balance, and a withdraw() method. Two bank accounts with the same balance are not necessarily the same account, but a bank account usually has an account number, and two accounts with the same account number should be considered the same.

class BankAccount:
   # MODIFY to initialize a number attribute
    def __init__(self, number, balance=0):
        self.balance = balance
        self.number = number
      
    def withdraw(self, amount):
        self.balance -= amount 
    
    # Define __eq__ that returns True if the number attributes are equal 
    def __eq__(self, other):
        return self.number == other.number   

# Create accounts and compare them       
acct1 = BankAccount(123, 1000)
acct2 = BankAccount(123, 1000)
acct3 = BankAccount(456, 1000)
print(acct1 == acct2)
print(acct1 == acct3)
True
False

Great job! Notice that your method compares just the account numbers, but not balances. What would happen if two accounts have the same account number but different balances? The code you wrote will treat these accounts as equal, but it might be better to throw an error - an exception - instead, informing the user that something is wrong. At the end of the chapter, you’ll learn how to define your own exception classes to create these kinds of custom errors.

example

Checking class equality

In the previous exercise, you defined a BankAccount class with a number attribute that was used for comparison. But if you were to compare a BankAccount object to an object of another class that also has a number attribute, you could end up with unexpected results.

For example, consider two classes

class Phone:
  def __init__(self, number):
     self.number = number

  def __eq__(self, other):
    return self.number == \
          other.number

pn = Phone(873555333)

class BankAccount:
  def __init__(self, number):
     self.number = number

  def __eq__(self, other):
    return self.number == \
           other.number

acct = BankAccount(873555333)

Running acct == pn will return True, even though we’re comparing a phone number with a bank account number.

It is good practice to check the class of objects passed to the __eq__() method to make sure the comparison makes sense.

my turn

class BankAccount:
    def __init__(self, number, balance=0):
        self.number, self.balance = number, balance
      
    def withdraw(self, amount):
        self.balance -= amount 

    # MODIFY to add a check for the type()
    def __eq__(self, other):
        return (self.number == other.number) and (type(self) == type(other))    

acct = BankAccount(873555333)      
pn = Phone(873555333)
print(acct == pn)
False

Perfect! Now only comparing objects of the same class BankAccount could return True. Another way to ensure that an object has the type you expect is to use the isinstance(obj, Class) function. This can be helpful when handling inheritance, as Python considers an object to be an instance of both the parent and the child class. Try running pn == acct in the console (with reversed order of equality). What does this tell you about the __eq__() method?

Correct! Python always calls the child’s __eq__() method when comparing a child object to a parent object.
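
This rule can be observed directly by recording which implementation runs. Account and SavingsAccount below are hypothetical classes written only for this sketch.

```python
calls = []  # records which __eq__ implementation Python invoked


class Account:  # hypothetical parent class
    def __init__(self, number):
        self.number = number

    def __eq__(self, other):
        calls.append("Account.__eq__")
        return isinstance(other, Account) and self.number == other.number


class SavingsAccount(Account):  # hypothetical child class
    def __eq__(self, other):
        calls.append("SavingsAccount.__eq__")
        return isinstance(other, Account) and self.number == other.number


parent, child = Account(123), SavingsAccount(123)

result_1 = parent == child  # child is on the right, yet its __eq__ runs first
result_2 = child == parent  # child is on the left, so its __eq__ runs
print(calls)
print(result_1, result_2)
```

Using isinstance(other, Account) instead of comparing type() values lets a child object compare equal to a parent object with the same number, which may or may not be what an application wants.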

Operator overloading: string representation

  • We saw above that printing an object of a custom class returns the object’s location in memory, by default
    • But most classes’ printout is much more informative (e.g. numpy array or dataframe shows the actual data in object)
  • 2 special methods we can define to create a printable representation of the object
    • __str__()
      • executed with print() or str() on object
      • supposed to give an informal representation, suitable to end users
    • __repr__()
      • executed with repr() or printed in console without explicit print() command
      • supposed to be more formal, mainly for developers
    • best practice: use __repr__() to return a string that can be used to reproduce the object (reproducible representation); e.g. a NumPy array shows the exact call used to create the object
      • if you only define one, define __repr__(): it is also used as the fallback for print() when __str__() is not defined
import numpy as np
print(np.array([1,2,3]))
str(np.array([1,2,3]))
repr(np.array([1,2,3]))
[1 2 3]
'array([1, 2, 3])'
class Customer:
    def __init__(self, name, balance):
        self.name, self.balance = name, balance
    
    def __str__(self):      # only argument is self; return a string
        cust_str = """     
        Customer:
            name: {name}
            balance: {balance}
        """.format(name = self.name, \
                   balance = self.balance) 
        return cust_str
cust = Customer("Maryam Azar", 3000)

# Will implicitly call __str__()
print(cust)
     
        Customer:
            name: Maryam Azar
            balance: 3000
        
# alternatively, use repr()
class Customer:
    def __init__(self, name, balance):
        self.name, self.balance = name, balance
    
    # best practice: return the string that can reproduce object
    def __repr__(self):      # notice the quotes around name
        return "Customer('{name}', {balance})".format(name = self.name, balance = self.balance)
cust = Customer("Miryam Azar", 3000)
cust                    # will implicitly call __repr__()
Customer('Miryam Azar', 3000)

Note that the output is not Customer(Miryam Azar, 3000): the format string puts single quotes around {name}. Without those quotes, the customer’s name would be substituted into the string as-is, but the point of __repr__() is to give the exact call needed to reproduce the object, so the name should be in quotes.
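
As a sketch of an alternative to typing the quotes by hand, the !r conversion in a format string applies repr() to the value, which adds the quotes for strings automatically. The eval() round trip below is only a quick check that the representation is reproducible; it is not something the notes above prescribe.

```python
class Customer:
    def __init__(self, name, balance):
        self.name, self.balance = name, balance

    def __repr__(self):
        # {!r} inserts repr(value), so the string name comes out quoted
        return "Customer({!r}, {!r})".format(self.name, self.balance)


cust = Customer("Maryam Azar", 3000)
text = repr(cust)
print(text)

# No __str__ is defined, so str() and print() fall back to __repr__
print(str(cust) == text)

# The representation is a valid constructor call
clone = eval(text)
print(clone.name, clone.balance)
```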

examples

Great work! To recap: to format a string with variables, you can either use keyword arguments in .format ('Insert {n} here'.format(n=num)), refer to them by position index explicitly (like 'Insert {0} here'.format(num)) or implicitly (like 'Insert {} here'.format(num)). You can use double quotation marks inside single quotation marks and the other way around, but to nest the same kind of quotation marks, you need to escape them with a backslash like \".

class Employee:
    def __init__(self, name, salary=30000):
        self.name, self.salary = name, salary
            
    # Add the __str__() method
    def __str__(self):

        emp_str = """
        Employee name: {name}
        Employee salary: {salary}
        """.format(name = self.name, \
                   salary = self.salary)
        return emp_str

emp1 = Employee("Amar Howard", 30000)
print(emp1)
emp2 = Employee("Carolyn Ramirez", 35000)
print(emp2)

        Employee name: Amar Howard
        Employee salary: 30000
        

        Employee name: Carolyn Ramirez
        Employee salary: 35000
        
class Employee:
    def __init__(self, name, salary=30000):
        self.name, self.salary = name, salary
      

    def __str__(self):
        s = "Employee name: {name}\nEmployee salary: {salary}".format(name=self.name, salary=self.salary)      
        return s
      
    # Add the __repr__method  
    def __repr__(self):
        return "Employee('{name}', {salary})".format(name = self.name, salary = self.salary)   

emp1 = Employee("Amar Howard", 30000)
print(repr(emp1))
emp2 = Employee("Carolyn Ramirez", 35000)
print(repr(emp2))
Employee('Amar Howard', 30000)
Employee('Carolyn Ramirez', 35000)

Exceptions

Exceptions/errors raised in execution of code

Often want to prevent the program from terminating when exception raised

  • try - except - finally code
    • wrap the code you’re worried about in a try block
    • then add an except block, followed by name of particular exception you want to handle, and code that should be executed when an exception is raised
      • can have multiple exception blocks
    • finally will run no matter what
try:
    # Try running some code
except ExceptionNameHere:
    # Run this code if ExceptionNameHere happens
except AnotherExceptionHere:
    # Run this code if AnotherExceptionHere happens
finally:
    # Run this code no matter what
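
A runnable instance of this template might look like the sketch below. safe_divide() is a hypothetical helper that also records which blocks ran.

```python
def safe_divide(a, b):  # hypothetical helper for illustration
    log = []
    try:
        result = a / b
        log.append("try finished")
    except ZeroDivisionError:
        result = None
        log.append("division by zero handled")
    except TypeError:
        result = None
        log.append("bad operand type handled")
    finally:
        # Runs whether or not an exception was raised
        log.append("finally ran")
    return result, log


print(safe_divide(6, 3))
print(safe_divide(1, 0))
print(safe_divide("1", 2))
```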

Sometimes you want to raise exceptions yourself, such as when some conditions aren’t satisfied - raise, optionally followed by a message in ()

def make_list_of_ones(length):
    if length <= 0:
        raise ValueError("Invalid length!") #<-- Will stop program and raise error

    return [1]*length
#make_list_of_ones(-1) # gives ValueError: Invalid length!

In python, exceptions are actually classes inherited from built-in classes BaseException or Exception - https://docs.python.org/3/library/exceptions.html

To define a custom exception, define a class that inherits from Exception class or one of its subclasses, the class itself can be empty, inheritance alone is enough

class BalanceError(Exception): pass
class Customer:
    def __init__(self, name, balance):
        if balance < 0 :
            raise BalanceError("Balance has to be non-negative!")
        else:
            self.name, self.balance = name, balance
#cust = Customer("Larry Tores", -100)

#cust

Note here the constructor terminates, and customer object is not created at all

better than creating object and setting balance to 0 (as above)

A user of the class can then catch the exception and decide how to handle it (we, the authors of the code, don’t decide for them):

try:
    cust = Customer("Larry Tores", -100)

except BalanceError:
    cust = Customer("Larry Tores", 0)

examples

# MODIFY the function to catch exceptions
def invert_at_index(x, ind):
  try:
    return 1/x[ind]
  except ZeroDivisionError:
    print("Cannot divide by zero!")
  except IndexError:
    print("Index out of range!")
 
a = [5,6,0,7]

# Works okay
print(invert_at_index(a, 1))

# Potential ZeroDivisionError
print(invert_at_index(a, 2))

# Potential IndexError
print(invert_at_index(a, 5))
0.16666666666666666
Cannot divide by zero!
None
Index out of range!
None

You don’t have to rely solely on built-in exceptions like IndexError: you can define your own exceptions more specific to your application. You can also define exception hierarchies. All you need to define an exception is a class inherited from the built-in Exception class or one of its subclasses.

In Chapter 1, you defined an Employee class and used print statements and default values to handle errors like creating an employee with a salary below the minimum or giving a raise that is too big. A better way to handle this situation is to use exceptions. Because these errors are specific to our application (unlike, for example, a division by zero error which is universal), it makes sense to use custom exception classes.

# Define SalaryError inherited from ValueError
class SalaryError(ValueError): pass

# Define BonusError inherited from SalaryError
class BonusError(SalaryError): pass
class SalaryError(ValueError): pass
class BonusError(SalaryError): pass

class Employee:
  MIN_SALARY = 30000
  MAX_RAISE = 5000

  def __init__(self, name, salary = 30000):
    self.name = name
    
    # If salary is too low
    if salary < Employee.MIN_SALARY:
      # Raise a SalaryError exception
      raise SalaryError("Salary is too low!")
      
    self.salary = salary
class SalaryError(ValueError): pass
class BonusError(SalaryError): pass

class Employee:
  MIN_SALARY = 30000
  MAX_BONUS = 5000

  def __init__(self, name, salary = 30000):
    self.name = name    
    if salary < Employee.MIN_SALARY:
      raise SalaryError("Salary is too low!")      
    self.salary = salary
    
  # Rewrite using exceptions  
  def give_bonus(self, amount):
    if amount > Employee.MAX_BONUS:
       print("The bonus amount is too high!")  
        
    elif self.salary + amount <  Employee.MIN_SALARY:
       print("The salary after bonus is too low!")
      
    else:  
      self.salary += amount
class SalaryError(ValueError): pass
class BonusError(SalaryError): pass

class Employee:
  MIN_SALARY = 30000
  MAX_BONUS = 5000

  def __init__(self, name, salary = 30000):
    self.name = name    
    if salary < Employee.MIN_SALARY:
      raise SalaryError("Salary is too low!")      
    self.salary = salary
    
  # Rewrite using exceptions  
  def give_bonus(self, amount):
    if amount > Employee.MAX_BONUS:
       raise BonusError("The bonus amount is too high!")  
        
    elif self.salary + amount <  Employee.MIN_SALARY:
       raise SalaryError("The salary after bonus is too low!")
      
    else:  
      self.salary += amount

Wonderful! Notice that if you raise an exception inside an if statement, you don’t need to add an else branch to run the rest of the code. Because raise terminates the function, the code after raise will only be executed if an exception did not occur.

Exactly! It’s better to list the except blocks from most specific to least specific, i.e. children before parents, otherwise the child exception will be caught in the parent’s except block.
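
The effect of the ordering can be checked with a small sketch. The two handler functions are hypothetical and exist only to compare the two orderings.

```python
class SalaryError(ValueError): pass
class BonusError(SalaryError): pass


def handle_child_first(exc):  # hypothetical helper
    try:
        raise exc
    except BonusError:
        return "BonusError block"
    except SalaryError:
        return "SalaryError block"


def handle_parent_first(exc):  # hypothetical helper
    try:
        raise exc
    except SalaryError:
        # A BonusError is also a SalaryError, so it is caught here
        return "SalaryError block"
    except BonusError:
        # Never reached
        return "BonusError block"


print(handle_child_first(BonusError("too high")))
print(handle_parent_first(BonusError("too high")))
print(handle_child_first(SalaryError("too low")))
```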

Best Practices of Class Design

Designing for Inheritance and Polymorphism

Polymorphism: using a unified interface to operate on objects of different classes

All that matters is the interface

# Withdraw amount from each of the accounts in list_of_accounts
# function doesn't know or care whether objects are Checking, Savings, BankAccount, etc
#   all that matters is they have a withdraw() method with one argument

def batch_withdraw(list_of_accounts, amount):
    for acct in list_of_accounts:
        acct.withdraw(amount)

# Constructor arguments follow the class definitions earlier in these notes; 100 is an example amount
b, c, s = BankAccount(1000), CheckingAccount(2000, 25), SavingsAccount(3000, 0.03)
batch_withdraw([b, c, s], 100)  #<--- Will use BankAccount.withdraw(),
                                # then CheckingAccount.withdraw(),
                                # then SavingsAccount.withdraw()

# when withdraw method is actually called, python will dynamically pull the correct method (checking withdraw or bankacct withdraw)

batch_withdraw() doesn’t need to check object to know which withdraw() method to call - to make use of this, you have to design your classes with inheritance and polymorphism in mind

Object-oriented design principle of when & how to use inheritance: Liskov substitution principle (LSP):

“Base class should be interchangeable with any of its subclasses without altering any properties of the program”

e.g. whenever in our app we use BankAccount instance, substituting a CheckingAccount instead should not affect anything in the surrounding program - e.g. batch_withdraw() method worked regardless of kind of account

Syntactically:

  • a method in a subclass should have a signature with parameters and returned values compatible with the method in the parent class

Semantically:

  • the state of objects should stay consistent
  • a subclass method shouldn’t rely on stronger input conditions
  • a subclass method shouldn’t provide weaker output conditions
  • a subclass method shouldn’t throw additional exceptions

Possible violations of LSP (where can’t use subclass in place of parent class):

  • syntactic incompatibility
    • BankAccount.withdraw() requires 1 parameter, but CheckingAccount.withdraw() requires 2
      • Couldn’t use the subclass’s withdraw() in place of the parent’s
      • But if subclass had a default value for the second param, then no problem!
  • subclass strengthening input conditions
    • BankAccount.withdraw() accepts any amount, but CheckingAccount.withdraw() assumes amount is limited
  • subclass weakening output conditions
    • BankAccount.withdraw() can only leave a positive balance or cause an error, but CheckingAccount.withdraw() can leave negative balance
  • changing additional attributes in subclass’ method
  • throwing additional exceptions in subclass’ method

Ultimate rule: if your class hierarchy violates LSP, then you should not be using inheritance
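
The syntactic violation can be made concrete with a sketch. All classes below are hypothetical minimal versions: FeeAccount requires an extra parameter, while SafeFeeAccount gives it a default and therefore stays substitutable.

```python
class BankAccount:  # hypothetical minimal parent
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount


class FeeAccount(BankAccount):  # violates LSP: extra required parameter
    def withdraw(self, amount, fee):
        self.balance -= amount + fee


class SafeFeeAccount(BankAccount):  # compatible: the extra parameter has a default
    def withdraw(self, amount, fee=0):
        self.balance -= amount + fee


def batch_withdraw(accounts, amount):
    # Written against the parent's interface only
    for acct in accounts:
        acct.withdraw(amount)


ok_accounts = [BankAccount(1000), SafeFeeAccount(1000)]
batch_withdraw(ok_accounts, 100)
print([acct.balance for acct in ok_accounts])

try:
    batch_withdraw([BankAccount(1000), FeeAccount(1000)], 100)
    broke = False
except TypeError:
    broke = True
print(broke)
```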

examples

circle-ellipse problem

Square and rectangle

The classic example of a problem that violates the Liskov Substitution Principle is the Circle-Ellipse problem, sometimes called the Square-Rectangle problem.

By all means, it seems like you should be able to define a class Rectangle, with attributes h and w (for height and width), and then define a class Square that inherits from the Rectangle. After all, a square “is-a” rectangle!

Unfortunately, this intuition doesn’t apply to object-oriented design.

# Define a Rectangle class
class Rectangle:
    def __init__(self, h, w):
      self.h, self.w = h, w

# Define a Square class
class Square(Rectangle):
    def __init__(self, w):
      self.h, self.w = w, w  

A Square inherited from a Rectangle will always have both the h and w attributes, but we can’t allow them to change independently of each other.

class Rectangle:
    def __init__(self, w,h):
      self.w, self.h = w,h

# Define set_h to set h      
    def set_h(self, h):
      self.h = h
      
# Define set_w to set w          
    def set_w(self, w):
      self.w = w
      
      
class Square(Rectangle):
    def __init__(self, w):
      self.w, self.h = w, w 

# Define set_h to set w and h
    def set_h(self, h):
      self.h = h
      self.w = h

# Define set_w to set w and h      
    def set_w(self, w):
      self.h = w
      self.w = w 

Later in this chapter you’ll learn how to make these setter methods run automatically when attributes are assigned new values. Don’t worry about that for now; just assume that when we assign a value to h of a square, the w attribute will be changed accordingly.

How does using these setter methods violate Liskov Substitution principle?

Each of the setter methods of Square changes both the h and w attributes, while the setter methods of Rectangle change only one attribute at a time, so Square objects cannot be substituted for Rectangle objects in programs that rely on one attribute staying constant.

Correct! Remember that the substitution principle requires the substitution to preserve the overall state of the program. An example of a program that would fail when this substitution is made is a unit test for the setter functions in the Rectangle class.
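
Such a test could be sketched as follows. The classes repeat the definitions above in compact form, and set_h_keeps_width() is a hypothetical check, not part of the exercise.

```python
class Rectangle:
    def __init__(self, w, h):
        self.w, self.h = w, h

    def set_h(self, h):
        self.h = h

    def set_w(self, w):
        self.w = w


class Square(Rectangle):
    def __init__(self, w):
        self.w, self.h = w, w

    def set_h(self, h):
        self.h = self.w = h

    def set_w(self, w):
        self.h = self.w = w


def set_h_keeps_width(shape):  # hypothetical unit-test style check
    # For a Rectangle, changing the height must leave the width alone
    old_w = shape.w
    shape.set_h(old_w + 1)
    return shape.w == old_w


print(set_h_keeps_width(Rectangle(2, 3)))
print(set_h_keeps_width(Square(2)))
```

The check passes for a Rectangle and fails for a Square, so a Square cannot stand in for a Rectangle in this program.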

Managing Data Access

In Python, all class data is public: any attribute or method can be accessed by anyone. This reflects a fundamental Python principle, “we are all adults here”: you should have trust in your fellow developers.

  • Naming conventions: can use some universal naming conventions to signal that data is not for external consumption
  • Properties: a special kind of attribute, defined with @property, that allows you to control how each attribute is modified
  • Special methods that you can override to change how attributes are used
    • override __getattr__() and __setattr__()

Naming conventions:

  • Internal attributes: a single leading underscore indicates an attribute/method that isn’t part of the public class interface and can change without notice
    • obj._att_name, obj._method_name()
    • not part of the public API
    • as a class user: “don’t touch this”
    • as a developer: use for implementation details, helper functions
    • e.g. a pandas DataFrame has a df._is_mixed_type attribute that indicates whether it contains data of mixed types
    • e.g. the datetime module contains a datetime._ymd2ord() function that converts a date into the number of days that have passed since Jan 1 of year 1
  • Pseudo-private attributes: a double leading underscore is the closest thing to “private”
    • Python implements name mangling: any name starting with a double underscore is automatically prefixed with an underscore and the name of the class, and that new name is the actual internal name of the attribute or method
    • obj.__attr_name is interpreted as obj._MyClass__attr_name
    • as a result, this data is effectively not inherited under its original name
    • main use is to prevent name clashes in child classes (it is possible that someone will unknowingly create an attribute or method in a child class that overwrites one in the parent class)
    • can use double leading underscores to protect important attributes/methods that should not be overridden
  • Leading AND trailing double underscores are only used for Python built-in methods like __init__()
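
Name mangling is easiest to see by inspecting an instance. Parent and Child below are hypothetical classes written only for this sketch.

```python
class Parent:
    def __init__(self):
        self._internal = "convention only"
        self.__secret = "parent value"  # stored as _Parent__secret

    def reveal(self):
        return self.__secret  # compiled as self._Parent__secret


class Child(Parent):
    def __init__(self):
        Parent.__init__(self)
        self.__secret = "child value"  # stored as _Child__secret, no clash


c = Child()
print(c.reveal())            # the parent's value survived
print(sorted(vars(c)))       # the mangled names actually stored on the object
print(hasattr(c, "__secret"))
print(c._Parent__secret)     # still reachable, so "private" is only approximate
```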

example

Using internal attributes

In this exercise, you’ll return to the BetterDate class of Chapter 2.

You decide to add a method that checks the validity of the date, but you don’t want to make it a part of BetterDate’s public interface.

The class BetterDate is available in the script pane.

# Add class attributes for max number of days and months
class BetterDate:
    _MAX_DAYS = 31
    _MAX_MONTHS = 12
    
    def __init__(self, year, month, day):
        self.year, self.month, self.day = year, month, day
        
    @classmethod
    def from_str(cls, datestr):
        year, month, day = map(int, datestr.split("-"))
        return cls(year, month, day)
        
    # Add _is_valid() checking day and month values
    def _is_valid(self):
        return (self.day <= BetterDate._MAX_DAYS) and \
               (self.month <= BetterDate._MAX_MONTHS)
        
bd1 = BetterDate(2020, 4, 30)
print(bd1._is_valid())

bd2 = BetterDate(2020, 6, 45)
print(bd2._is_valid())
True
False

Great job! Notice that you were still able to use the _is_valid() method as usual. The single underscore naming convention is purely a convention, and Python doesn’t do anything special with such attributes and methods behind the scenes. That convention is widely followed, though, so if you see an attribute name with one leading underscore in someone’s class - don’t use it! The class developer trusts you with this responsibility.

Properties

  • Properties are a special kind of attribute that allows customized access

Consider our Employee class. We saw that you can read and set an attribute such as name or salary by plain assignment on an instance (not only in the constructor). But this means we can assign anything: a negative number, a million, the word “hello”, etc. Salary is an important attribute, so this should not be allowed. We want to control access to salary: validate it, or even make it read-only.
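
A minimal sketch of the problem, assuming a simplified Employee class with plain attributes (the real class from earlier chapters may have more methods):

```python
class Employee:
    def __init__(self, name, salary):
        self.name = name
        self.salary = salary   # plain attribute: no validation at all


emp = Employee("Miriam Azari", 35000)

# Nothing stops these assignments
emp.salary = -1000
emp.salary = "hello"
print(emp.salary)   # hello
```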

Use @property decorator

  • define an ‘internal’/‘protected’ attribute, named with a leading _, that will store the data
    • then use @property to define a method whose name is exactly the name of the restricted attribute; it returns the internal attribute
    • use @attr.setter on a second method, also named attr(), to customize how the value is set; it is called on obj.attr = value
class Employee:
    def __init__(self, name, new_salary):
        self.name = name
        self._salary = new_salary
    
    @property
    def salary(self):
        return self._salary
    
    @salary.setter
    def salary(self, new_salary):
        if new_salary < 0:
            raise ValueError("Invalid salary")
        self._salary = new_salary
emp = Employee("Miriam Azari", 35000)
# accessing the "property"
emp.salary

emp.salary = 60000 #<-- @salary.setter

If you do not define an @attr.setter method, the property will be read-only, like the shape attribute of a DataFrame.

  • @attr.getter is used for the method that is called when the property’s value is retrieved

  • @attr.deleter is used for the method that is called when the property’s value is deleted using del
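
The three decorators can be combined on one property. The sketch below uses a hypothetical Account class (not part of the course material), and the choice to have the deleter remove the stored value is just one possible design.

```python
class Account:
    def __init__(self, balance):
        self._balance = balance

    @property
    def balance(self):
        # Runs on `acct.balance`
        return self._balance

    @balance.setter
    def balance(self, value):
        # Runs on `acct.balance = value`
        if value < 0:
            raise ValueError("Invalid balance!")
        self._balance = value

    @balance.deleter
    def balance(self):
        # Runs on `del acct.balance`
        del self._balance


acct = Account(100)
acct.balance = 250
print(acct.balance)               # 250

del acct.balance
print(hasattr(acct, "_balance"))  # False: the deleter removed the stored value
```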

examples

# Create a Customer class
class Customer:
    def __init__(self, name, new_bal):
        self.name = name
        if new_bal < 0:
           raise ValueError("Invalid balance!")
        self._balance = new_bal  
class Customer:
    def __init__(self, name, new_bal):
        self.name = name
        if new_bal < 0:
           raise ValueError("Invalid balance!")
        self._balance = new_bal  
    
    # Add a decorated balance() method returning _balance
    @property
    def balance(self):
        return self._balance
class Customer:
    def __init__(self, name, new_bal):
        self.name = name
        if new_bal < 0:
           raise ValueError("Invalid balance!")
        self._balance = new_bal  

    # Add a decorated balance() method returning _balance        
    @property
    def balance(self):
        return self._balance
     
    # Add a setter balance() method
    @balance.setter
    def balance(self, balance):
        # Validate the parameter value
        if balance < 0:
            raise ValueError("Invalid balance!")
        self._balance = balance

        # Print "Setter method is called"
        print("Setter method is called")
class Customer:
    def __init__(self, name, new_bal):
        self.name = name
        if new_bal < 0:
           raise ValueError("Invalid balance!")
        self._balance = new_bal  

    # Add a decorated balance() method returning _balance        
    @property
    def balance(self):
        return self._balance

    # Add a setter balance() method
    @balance.setter
    def balance(self, new_bal):
        # Validate the parameter value
        if new_bal < 0:
           raise ValueError("Invalid balance!")
        self._balance = new_bal
        print("Setter method called")

# Create a Customer        
cust = Customer("Belinda Lutz", 2000)

# Assign 3000 to the balance property
cust.balance = 3000

# Print the balance property
print(cust.balance)
Setter method called
3000

Great start! Now the user of your Customer class won’t be able to assign arbitrary values to the customers’ balance. You could also add a custom getter method (with a decorator @balance.getter) that returns a value and gets executed every time the attribute is accessed.

Read-only properties

The LoggedDF class from Chapter 2 was an extension of the pandas DataFrame class that had an additional created_at attribute that stored the timestamp when the DataFrame was created, so that the user could see how out-of-date the data is.

But that class wasn’t very useful: we could just assign any value to created_at after the DataFrame was created, thus defeating the whole point of the attribute! Now, using properties, we can make the attribute read-only.

import pandas as pd
from datetime import datetime

# LoggedDF class definition from Chapter 2
class LoggedDF(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        pd.DataFrame.__init__(self, *args, **kwargs)
        self.created_at = datetime.today()

    def to_csv(self, *args, **kwargs):
        temp = self.copy()
        temp["created_at"] = self.created_at
        pd.DataFrame.to_csv(temp, *args, **kwargs)   

# Instantiate a LoggedDF called ldf
ldf = LoggedDF({"col1": [1,2], "col2":[3,4]}) 

# Assign a new value to ldf's created_at attribute and print
ldf.created_at = '2035-07-13'
print(ldf.created_at)
2035-07-13
Create an internal attribute called _created_at to turn created_at into a read-only attribute.
Modify the class to use the internal attribute, _created_at, in place of created_at.
import pandas as pd
from datetime import datetime

# MODIFY the class to use _created_at instead of created_at
class LoggedDF(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        pd.DataFrame.__init__(self, *args, **kwargs)
        self._created_at = datetime.today()
    
    def to_csv(self, *args, **kwargs):
        temp = self.copy()
        temp["created_at"] = self._created_at
        pd.DataFrame.to_csv(temp, *args, **kwargs)   
    
    # Add a read-only property: created_at
    # (a property getter takes only self)
    @property  
    def created_at(self):
        return self._created_at

# Instantiate a LoggedDF called ldf
ldf = LoggedDF({"col1": [1,2], "col2":[3,4]}) 
# now try setting
# ldf.created_at = '2035-07-13'
# raises AttributeError, because the property has no setter

You’ve put it all together! Notice that the to_csv() method in the original class was using the original created_at attribute. After converting the attribute into a property, you could replace the call to self.created_at with the call to the internal attribute that’s attached to the property, or you could keep it as self.created_at, in which case you’ll now be accessing the property. Either way works!
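
The same read-only pattern can be tried without pandas. This is a sketch using a hypothetical LoggedRecord class; the exact wording of the AttributeError message depends on the Python version, so it is printed rather than assumed.

```python
from datetime import datetime


class LoggedRecord:
    def __init__(self, data):
        self.data = data
        self._created_at = datetime.today()

    # Read-only property: a getter with no setter
    @property
    def created_at(self):
        return self._created_at

    def same_timestamp(self):
        # Inside the class, the property and the internal attribute
        # give back the very same object
        return self.created_at is self._created_at


rec = LoggedRecord([1, 2, 3])

try:
    rec.created_at = "2035-07-13"
except AttributeError as err:
    print("AttributeError:", err)

print(rec.same_timestamp())   # True
```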

What’s Next

  • Functionality
    • Multiple inheritance and mixin classes
    • Overriding builtin operators like +
    • __getattr__() and __setattr__()
    • Custom iterators
    • Abstract base classes
    • Dataclasses (new in Python 3.7)
  • Design
    • SOLID principles
      • Single-responsibility principle
      • Open-closed principle
      • Liskov substitution principle
      • Interface segregation principle
      • Dependency inversion principle
    • Design patterns