How To Make Your Python Code Prettier With Dataclasses

Python 3.7 introduced dataclasses. Their goal is to simplify the creation of classes that represent data models. A dataclass can be thought of as a high-level, object-oriented data structure.

When you use a regular Python class for this purpose, you have to implement methods such as __init__() and __repr__() yourself. Repeating the same routine for each class is cumbersome. Fortunately, dataclasses generate these methods for you automatically.
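To make the saving concrete, here is a rough sketch of what you would otherwise write by hand. The names PositionManual and PositionAuto are made up for this illustration, and the hand-written methods only approximate what the decorator generates.

```python
from dataclasses import dataclass


class PositionManual:
    """Hand-written version: every method is spelled out."""

    def __init__(self, lat: float, lon: float):
        self.lat = lat
        self.lon = lon

    def __repr__(self):
        return f"PositionManual(lat={self.lat!r}, lon={self.lon!r})"

    def __eq__(self, other):
        # Only instances of the same class are compared field by field.
        if other.__class__ is not self.__class__:
            return NotImplemented
        return (self.lat, self.lon) == (other.lat, other.lon)


@dataclass
class PositionAuto:
    """Dataclass version: __init__, __repr__ and __eq__ are generated."""
    lat: float
    lon: float


manual = PositionManual(37.6216, -122.3929)
auto = PositionAuto(37.6216, -122.3929)
print(manual)
print(auto)
```

Both objects print their fields in the same style and both support ==, but the second class needs only its two annotated fields.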

In this Python tutorial, you will get an overview of how dataclasses work through examples and explore the possibilities they offer.

Declaring a Python Dataclass

A dataclass is declared by applying the @dataclass decorator to a class. The code snippet below uses a dataclass to represent GPS coordinates:

from dataclasses import dataclass

@dataclass
class Position:
  lat: float
  lon: float

if __name__ == '__main__':
  position = Position(37.6216, -122.3929)
  # __repr__() is generated by the decorator
  print(position)
Dataclass representing GPS coordinates with latitude and longitude

When you run this code, the Position object is printed with its latitude and longitude attributes. As mentioned before, no extra __repr__() method is needed:

$ python
Position(lat=37.6216, lon=-122.3929)

You can compare dataclass instances with the equality operator, like values of any other Python type. No extra __eq__() method is needed either:

from dataclasses import dataclass

@dataclass
class Position:
  lat: float
  lon: float

if __name__ == '__main__':
  position = Position(37.6216, -122.3929)
  # __eq__() is generated and compares the fields of both instances
  print(position == Position(37.6216, -122.3929))

$ python
True
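As a rough illustration of what the generated __eq__() does, the sketch below compares instances field by field and also shows that two different dataclasses with identical fields are not considered equal. The Point class is a hypothetical name introduced only for this comparison.

```python
from dataclasses import dataclass


@dataclass
class Position:
    lat: float
    lon: float


@dataclass
class Point:  # hypothetical second class with the same fields
    lat: float
    lon: float


a = Position(37.6216, -122.3929)
b = Position(37.6216, -122.3929)

print(a == b)                          # True: same class, same field values
print(a is b)                          # False: still two distinct objects
print(a == Point(37.6216, -122.3929))  # False: different class
print(a == (37.6216, -122.3929))       # False: a tuple is not a Position
```

Equality is therefore based on values, but only between instances of the same class.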

Implementing Methods In a Python Dataclass

As with a traditional Python class, you can implement methods inside a dataclass. In this example, a method is added to calculate the haversine distance in kilometers between two positions:

from dataclasses import dataclass

import math

@dataclass
class Position:
  lat: float
  lon: float

  def distance_to(self, position):
    """
    Calculate haversine distance between two positions
    :param position: other position object
    :return: a float representing distance in kilometers between two positions
    """
    r = 6371.0  # Earth radius in kilometers
    lam1, lam2 = math.radians(self.lon), math.radians(position.lon)
    phi1, phi2 = math.radians(self.lat), math.radians(position.lat)
    delta_lam, delta_phi = lam2 - lam1, phi2 - phi1
    a = math.sin(delta_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lam / 2) ** 2
    return r * (2 * math.atan2(math.sqrt(a), math.sqrt(1 - a)))

if __name__ == '__main__':
  paris = Position(48.856614, 2.3522219)  # latitude first, then longitude
  san_francisco = Position(37.6216, -122.3929)
  print(paris.distance_to(san_francisco))
$ python
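One way to gain confidence in a formula like this is to check it against cases with known answers. The sketch below repeats the method and uses two such cases: the distance from a point to itself, and a quarter of the equator, which on a sphere of radius r is r * pi / 2. The test points and the EARTH_RADIUS_KM constant name are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0


@dataclass
class Position:
    lat: float
    lon: float

    def distance_to(self, position):
        """Haversine distance in kilometers to another position."""
        lam1, lam2 = math.radians(self.lon), math.radians(position.lon)
        phi1, phi2 = math.radians(self.lat), math.radians(position.lat)
        delta_lam, delta_phi = lam2 - lam1, phi2 - phi1
        a = (math.sin(delta_phi / 2) ** 2
             + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lam / 2) ** 2)
        return EARTH_RADIUS_KM * (2 * math.atan2(math.sqrt(a), math.sqrt(1 - a)))


origin = Position(0.0, 0.0)
quarter_turn = Position(0.0, 90.0)  # same latitude, 90 degrees further east

# A point is at distance zero from itself.
print(origin.distance_to(origin))

# A quarter of the equator should be r * pi / 2 kilometers long.
expected = EARTH_RADIUS_KM * math.pi / 2
print(math.isclose(origin.distance_to(quarter_turn), expected))

# The distance does not depend on the direction of travel.
print(math.isclose(origin.distance_to(quarter_turn),
                   quarter_turn.distance_to(origin)))
```

Checks like these catch the most common mistake with this formula, which is passing longitude where latitude is expected.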

Using Python Dataclass Objects As Attributes

A dataclass instance can be used like a value of any other Python type. To show this, the following code instantiates Town objects that each hold a Position dataclass object:

from dataclasses import dataclass

@dataclass
class Position:
  lat: float
  lon: float

@dataclass
class Town:
  name: str
  position: Position  # a dataclass used as the type of a field

if __name__ == '__main__':
  paris = Town('Paris', Position(48.856614, 2.3522219))
  san_francisco = Town('San Francisco', Position(37.6216, -122.3929))
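Because the nested Position is an ordinary attribute, you reach its fields with chained dot access. As a small sketch, the standard library helper dataclasses.asdict() can also convert the whole structure, including the nested dataclass, into plain dictionaries. The values below reuse the Paris example with latitude first.

```python
from dataclasses import dataclass, asdict


@dataclass
class Position:
    lat: float
    lon: float


@dataclass
class Town:
    name: str
    position: Position


paris = Town('Paris', Position(48.856614, 2.3522219))

# Chained attribute access reaches the nested dataclass fields.
print(paris.position.lat)

# asdict() recurses into nested dataclasses.
print(asdict(paris))
```

The dictionary form can be convenient when you need to serialize a model, for example to JSON.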

Dataclasses and Inheritance

The Town class presented in the last section can be simplified. Let's consider that a town is a position. Town objects then inherit the latitude and longitude attributes from the parent Position class:

from dataclasses import dataclass

@dataclass
class Position:
  lat: float
  lon: float

@dataclass
class Town(Position):
  name: str

if __name__ == '__main__':
  # Inherited fields come first: lat, lon, then name
  paris = Town(48.856614, 2.3522219, 'Paris')
  san_francisco = Town(37.6216, -122.3929, 'San Francisco')

To go even further, let's add a new class to distinguish capitals from other towns:

from dataclasses import dataclass

@dataclass
class Position:
  lat: float
  lon: float

@dataclass
class Town(Position):
  name: str

@dataclass
class Capital(Town):
  pass  # no new fields, the type itself marks a capital

if __name__ == '__main__':
  paris = Capital(48.856614, 2.3522219, 'Paris')
  san_francisco = Town(37.6216, -122.3929, 'San Francisco')
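The argument order used when calling Town(...) or Capital(...) follows from how dataclasses collect fields: those of the parent class come first, then those declared in the subclass. A minimal sketch using dataclasses.fields() makes that order visible and also checks the is-a relationship with isinstance():

```python
from dataclasses import dataclass, fields


@dataclass
class Position:
    lat: float
    lon: float


@dataclass
class Town(Position):
    name: str


@dataclass
class Capital(Town):
    pass


# Parent fields come first, so the constructor takes lat, lon, name.
print([f.name for f in fields(Capital)])  # ['lat', 'lon', 'name']

paris = Capital(48.856614, 2.3522219, 'Paris')
print(isinstance(paris, Town), isinstance(paris, Position))  # True True
```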

Dataclass Fields

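The dataclasses module provides the field() specifier to customize each field of your data. It supports several parameters, such as default, default_factory, and metadata. In the example below, metadata records that the latitude and longitude of a position are expressed in degrees: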

from dataclasses import dataclass, field

@dataclass
class Position:
  lat: float = field(default=0.0, metadata={'unit': 'degrees'})
  lon: float = field(default=0.0, metadata={'unit': 'degrees'})

@dataclass
class Town(Position):
  # Fields without a default cannot follow fields with a default,
  # so name needs a default value as well
  name: str = None

if __name__ == '__main__':
  paris = Town(48.856614, 2.3522219, 'Paris')
  san_francisco = Town(37.6216, -122.3929, 'San Francisco')
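The metadata mapping is not used by dataclasses themselves; it is there for your own code to read. A short sketch, assuming you want to look up the unit of each field, can do so through dataclasses.fields():

```python
from dataclasses import dataclass, field, fields


@dataclass
class Position:
    lat: float = field(default=0.0, metadata={'unit': 'degrees'})
    lon: float = field(default=0.0, metadata={'unit': 'degrees'})


# Each Field object exposes its name, default and metadata.
for f in fields(Position):
    print(f.name, f.default, f.metadata['unit'])

# With defaults on every field, the class can be created without arguments.
print(Position())
```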


Dataclasses offer an immutability option: pass frozen=True to the decorator. When this flag is enabled, fields can no longer be reassigned after the instance is created, and any attempt raises FrozenInstanceError.

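Be careful when combining frozen dataclasses with nesting and inheritance. A frozen dataclass cannot inherit from a non-frozen one, or the reverse, and freezing only blocks reassignment of the fields themselves: a mutable object stored in a frozen field, such as a list, can still be modified.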
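Here is a minimal sketch of frozen=True in action. It freezes Position, shows the error raised on assignment, and then shows one inheritance pitfall: a non-frozen dataclass cannot inherit from a frozen one. MutableTown is a hypothetical name used only to trigger that error.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Position:
    lat: float
    lon: float


position = Position(37.6216, -122.3929)

try:
    position.lat = 0.0  # assignment to a frozen field is rejected
except FrozenInstanceError as error:
    print('Cannot modify:', error)

try:
    @dataclass  # not frozen, but the parent class is
    class MutableTown(Position):
        name: str
except TypeError as error:
    print('Cannot inherit:', error)
```

Declaring the subclass with @dataclass(frozen=True) as well avoids the second error.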

Town positions are destined to change. The following example shows a Country dataclass that holds a collection of towns. In this class, the get_capital method returns the first capital found among the country's towns, or None if there is none:

from dataclasses import dataclass, field
from typing import List

@dataclass
class Position:
  lat: float = field(default=0.0, metadata={'unit': 'degrees'})
  lon: float = field(default=0.0, metadata={'unit': 'degrees'})

@dataclass
class Town(Position):
  name: str = None

@dataclass
class Capital(Town):
  pass

@dataclass
class Country:
  code: str
  # default_factory gives each Country its own empty list
  towns: List[Town] = field(default_factory=list)

  def get_capital(self):
    try:
      # Keep only the Capital instances and return the first one
      return list(filter(lambda x: isinstance(x, Capital), self.towns))[0]
    except IndexError:
      return None

if __name__ == '__main__':
  paris = Capital(48.856614, 2.3522219, 'Paris')
  san_francisco = Town(37.6216, -122.3929, 'San Francisco')
  washington = Capital(38.9072, -77.0369, 'Washington')  # Washington, D.C.
  united_states = Country('US', [san_francisco, washington])
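The same lookup can be written without building an intermediate list. The sketch below is one possible variant, not the version used in this article: it calls next() with a default of None over a generator expression. To keep it short, Town here carries only a name, which is a simplification. It also shows why default_factory=list matters, since each Country gets its own empty list.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Town:
    name: str


@dataclass
class Capital(Town):
    pass


@dataclass
class Country:
    code: str
    towns: List[Town] = field(default_factory=list)

    def get_capital(self) -> Optional[Capital]:
        # next() returns the first match, or None when there is none.
        return next((t for t in self.towns if isinstance(t, Capital)), None)


united_states = Country('US', [Town('San Francisco'), Capital('Washington')])
print(united_states.get_capital())

empty = Country('FR')  # towns defaults to a fresh empty list
print(empty.get_capital())

# The default list is not shared between instances.
empty.towns.append(Capital('Paris'))
print(Country('DE').towns)
```

The generator stops at the first capital it finds, so the rest of the list is never scanned.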


Through these examples, you have seen the following points about dataclasses:

  • You do not need to write boilerplate methods such as __init__(), __repr__(), and __eq__() for a new class. You can still write them explicitly to override the default behavior.
  • A dataclass is not so different from a traditional Python class.
  • You may define immutable objects when that suits your needs.
  • Dataclasses are an elegant way to create more readable data models.
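To illustrate the point about overriding default behavior, here is a small sketch. When the class body defines __repr__() itself, the decorator leaves it in place and still generates the other methods. The output format chosen here is arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Position:
    lat: float
    lon: float

    def __repr__(self):
        # Custom representation; __init__ and __eq__ are still generated.
        return f'<{self.lat:.4f}, {self.lon:.4f}>'


position = Position(37.6216, -122.3929)
print(position)                                   # <37.6216, -122.3929>
print(position == Position(37.6216, -122.3929))   # True
```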

Since discovering this feature, I have tried to use it as much as possible to write more readable code. And you?

