efro.dataclassio
Functionality for importing, exporting, and validating dataclasses.
This allows complex nested dataclasses to be flattened to json-compatible data and restored from said data. It also gracefully handles and preserves unrecognized attribute data, allowing older clients to interact with newer data formats in a nondestructive manner.
# Released under the MIT License. See LICENSE for details.
#
"""Functionality for importing, exporting, and validating dataclasses.

This allows complex nested dataclasses to be flattened to json-compatible
data and restored from said data. It also gracefully handles and preserves
unrecognized attribute data, allowing older clients to interact with newer
data formats in a nondestructive manner.
"""

from __future__ import annotations

from efro.util import set_canonical_module_names
from efro.dataclassio._base import (
    Codec,
    IOAttrs,
    IOExtendedData,
    IOMultiType,
    EXTRA_ATTRS_ATTR,
)
from efro.dataclassio._prep import (
    ioprep,
    ioprepped,
    will_ioprep,
    is_ioprepped_dataclass,
)
from efro.dataclassio._pathcapture import DataclassFieldLookup
from efro.dataclassio._api import (
    JsonStyle,
    dataclass_to_dict,
    dataclass_to_json,
    dataclass_from_dict,
    dataclass_from_json,
    dataclass_validate,
    dataclass_hash,
)

__all__ = [
    'Codec',
    'DataclassFieldLookup',
    'EXTRA_ATTRS_ATTR',
    'IOAttrs',
    'IOExtendedData',
    'IOMultiType',
    'JsonStyle',
    'dataclass_from_dict',
    'dataclass_from_json',
    'dataclass_to_dict',
    'dataclass_to_json',
    'dataclass_validate',
    'dataclass_hash',
    'ioprep',
    'ioprepped',
    'is_ioprepped_dataclass',
    'will_ioprep',
]

# Have these things present themselves cleanly as 'thismodule.SomeClass'
# instead of 'thismodule._internalmodule.SomeClass'
set_canonical_module_names(globals())
class Codec(Enum):
    """Specifies expected data format exported to or imported from."""

    # Use only types that will translate cleanly to/from json: lists,
    # dicts with str keys, bools, ints, floats, and None.
    JSON = 'json'

    # Mostly like JSON but passes bytes and datetime objects through
    # as-is instead of converting them to json-friendly types.
    FIRESTORE = 'firestore'
Specifies expected data format exported to or imported from.
Inherited Members
- enum.Enum
- name
- value
class DataclassFieldLookup(Generic[T]):
    """Get info about nested dataclass fields in type-safe way."""

    def __init__(self, cls: type[T]) -> None:
        self.cls = cls

    def path(self, callback: Callable[[T], Any]) -> str:
        """Look up a path on child dataclass fields.

        example:
          DataclassFieldLookup(MyType).path(lambda obj: obj.foo.bar)

        The above example will return the string 'foo.bar' or something
        like 'f.b' if the dataclasses have custom storage names set.
        It will also be static-type-checked, triggering an error if
        MyType.foo.bar is not a valid path. Note, however, that the
        callback technically allows any return value but only nested
        dataclasses and their fields will succeed.
        """

        # We tell the type system that we are returning an instance
        # of our class, which allows it to perform type checking on
        # member lookups. In reality, however, we are providing a
        # special object which captures path lookups, so we can build
        # a string from them.
        if not TYPE_CHECKING:
            out = callback(_PathCapture(self.cls))
            if not isinstance(out, _PathCapture):
                raise TypeError(
                    f'Expected a valid path under'
                    f' the provided object; got a {type(out)}.'
                )
            return out.path
        return ''

    def paths(self, callback: Callable[[T], list[Any]]) -> list[str]:
        """Look up multiple paths on child dataclass fields.

        Functionality is identical to path() but for multiple paths at once.

        example:
          DataclassFieldLookup(MyType).paths(lambda obj: [obj.foo, obj.bar])
        """
        outvals: list[str] = []
        if not TYPE_CHECKING:
            outs = callback(_PathCapture(self.cls))
            assert isinstance(outs, list)
            for out in outs:
                if not isinstance(out, _PathCapture):
                    raise TypeError(
                        f'Expected a valid path under'
                        f' the provided object; got a {type(out)}.'
                    )
                outvals.append(out.path)
        return outvals
Get info about nested dataclass fields in type-safe way.
def path(self, callback: Callable[[T], Any]) -> str:
Look up a path on child dataclass fields.
Example:
    DataclassFieldLookup(MyType).path(lambda obj: obj.foo.bar)
The above example returns the string 'foo.bar', or something like 'f.b' if the dataclasses have custom storage names set. The lookup is also statically type-checked, so a type checker reports an error if MyType.foo.bar is not a valid path. Note, however, that the callback's signature permits any return value; at runtime only nested dataclasses and their fields succeed, and anything else raises a TypeError.
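The path-capture idea described above can be illustrated with a small standalone sketch. This is not the library's `_PathCapture` implementation: the `PathCapture` class and `lookup_path` helper below are hypothetical names, and unlike the real lookup they do not check that each attribute is an actual dataclass field or apply custom storage names. They only show how attribute access on a stand-in object can be recorded and joined into a dotted string.

```python
from __future__ import annotations

from typing import Any, Callable


class PathCapture:
    """Hypothetical stand-in object that records attribute lookups."""

    def __init__(self, parts: tuple[str, ...] = ()) -> None:
        self._parts = parts

    def __getattr__(self, name: str) -> PathCapture:
        # __getattr__ only runs for names not found normally, so reads of
        # self._parts and self.path never recurse into this method.
        return PathCapture(self._parts + (name,))

    @property
    def path(self) -> str:
        return '.'.join(self._parts)


def lookup_path(callback: Callable[[Any], Any]) -> str:
    """Run the callback against a capture object and return its path."""
    out = callback(PathCapture())
    if not isinstance(out, PathCapture):
        raise TypeError(f'Expected a valid path; got a {type(out)}.')
    return out.path


result = lookup_path(lambda obj: obj.foo.bar)
print(result)  # 'foo.bar'
```

In the real class, the callback parameter is typed as the dataclass itself, which is what lets a static type checker validate the attribute chain even though a capture object is passed at runtime.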
def paths(self, callback: Callable[[T], list[Any]]) -> list[str]:
Look up multiple paths on child dataclass fields.
Functionality is identical to path() but for multiple paths at once.
Example:
    DataclassFieldLookup(MyType).paths(lambda obj: [obj.foo, obj.bar])
class IOAttrs:
    """For specifying io behavior in annotations.

    'storagename', if passed, is the name used when storing to json/etc.
    'store_default' can be set to False to avoid writing values when equal
        to the default value. Note that this requires the dataclass field
        to define a default or default_factory or for its IOAttrs to
        define a soft_default value.
    'whole_days', if True, requires datetime values to be exactly on day
        boundaries (see efro.util.utc_today()).
    'whole_hours', if True, requires datetime values to lie exactly on hour
        boundaries (see efro.util.utc_this_hour()).
    'whole_minutes', if True, requires datetime values to lie exactly on minute
        boundaries (see efro.util.utc_this_minute()).
    'soft_default', if passed, injects a default value into dataclass
        instantiation when the field is not present in the input data.
        This allows dataclasses to add new non-optional fields while
        gracefully 'upgrading' old data. Note that when a soft_default is
        present it will take precedence over field defaults when determining
        whether to store a value for a field with store_default=False
        (since the soft_default value is what we'll get when reading that
        same data back in when the field is omitted).
    'soft_default_factory' is similar to 'default_factory' in dataclass
        fields; it should be used instead of 'soft_default' for mutable types
        such as lists to prevent a single default object from unintentionally
        changing over time.
    """

    # A sentinel object to detect if a parameter is supplied or not. Use
    # a class to give it a better repr.
    class _MissingType:
        pass

    MISSING = _MissingType()

    storagename: str | None = None
    store_default: bool = True
    whole_days: bool = False
    whole_hours: bool = False
    whole_minutes: bool = False
    soft_default: Any = MISSING
    soft_default_factory: Callable[[], Any] | _MissingType = MISSING

    def __init__(
        self,
        storagename: str | None = storagename,
        store_default: bool = store_default,
        whole_days: bool = whole_days,
        whole_hours: bool = whole_hours,
        whole_minutes: bool = whole_minutes,
        soft_default: Any = MISSING,
        soft_default_factory: Callable[[], Any] | _MissingType = MISSING,
    ):
        # Only store values that differ from class defaults to keep
        # our instances nice and lean.
        cls = type(self)
        if storagename != cls.storagename:
            self.storagename = storagename
        if store_default != cls.store_default:
            self.store_default = store_default
        if whole_days != cls.whole_days:
            self.whole_days = whole_days
        if whole_hours != cls.whole_hours:
            self.whole_hours = whole_hours
        if whole_minutes != cls.whole_minutes:
            self.whole_minutes = whole_minutes
        if soft_default is not cls.soft_default:
            # Do what dataclasses does with its default types and
            # tell the user to use factory for mutable ones.
            if isinstance(soft_default, (list, dict, set)):
                raise ValueError(
                    f'mutable {type(soft_default)} is not allowed'
                    f' for soft_default; use soft_default_factory.'
                )
            self.soft_default = soft_default
        if soft_default_factory is not cls.soft_default_factory:
            self.soft_default_factory = soft_default_factory
            if self.soft_default is not cls.soft_default:
                raise ValueError(
                    'Cannot set both soft_default and soft_default_factory'
                )

    def validate_for_field(self, cls: type, field: dataclasses.Field) -> None:
        """Ensure the IOAttrs instance is ok to use with the provided field."""

        # Turning off store_default requires the field to have either
        # a default or a default_factory or for us to have soft equivalents.

        if not self.store_default:
            field_default_factory: Any = field.default_factory
            if (
                field_default_factory is dataclasses.MISSING
                and field.default is dataclasses.MISSING
                and self.soft_default is self.MISSING
                and self.soft_default_factory is self.MISSING
            ):
                raise TypeError(
                    f'Field {field.name} of {cls} has'
                    f' neither a default nor a default_factory'
                    f' and IOAttrs contains neither a soft_default'
                    f' nor a soft_default_factory;'
                    f' store_default=False cannot be set for it.'
                )

    def validate_datetime(
        self, value: datetime.datetime, fieldpath: str
    ) -> None:
        """Ensure a datetime value meets our value requirements."""
        if self.whole_days:
            if any(
                x != 0
                for x in (
                    value.hour,
                    value.minute,
                    value.second,
                    value.microsecond,
                )
            ):
                raise ValueError(
                    f'Value {value} at {fieldpath} is not a whole day.'
                )
        elif self.whole_hours:
            if any(
                x != 0 for x in (value.minute, value.second, value.microsecond)
            ):
                raise ValueError(
                    f'Value {value} at {fieldpath}' f' is not a whole hour.'
                )
        elif self.whole_minutes:
            if any(x != 0 for x in (value.second, value.microsecond)):
                raise ValueError(
                    f'Value {value} at {fieldpath}' f' is not a whole minute.'
                )
For specifying io behavior in annotations.
- 'storagename', if passed, is the name used when storing to json/etc.
- 'store_default' can be set to False to avoid writing values when equal to the default value. Note that this requires the dataclass field to define a default or default_factory, or its IOAttrs to define a soft_default or soft_default_factory.
- 'whole_days', if True, requires datetime values to lie exactly on day boundaries (see efro.util.utc_today()).
- 'whole_hours', if True, requires datetime values to lie exactly on hour boundaries (see efro.util.utc_this_hour()).
- 'whole_minutes', if True, requires datetime values to lie exactly on minute boundaries (see efro.util.utc_this_minute()).
- 'soft_default', if passed, injects a default value into dataclass instantiation when the field is not present in the input data. This allows dataclasses to add new non-optional fields while gracefully 'upgrading' old data. Note that when a soft_default is present it takes precedence over field defaults when determining whether to store a value for a field with store_default=False (since the soft_default value is what we'll get when reading that same data back in when the field is omitted).
- 'soft_default_factory' is similar to 'default_factory' in dataclass fields; it should be used instead of 'soft_default' for mutable types such as lists to prevent a single default object from unintentionally changing over time.
def __init__(
    self,
    storagename: str | None = storagename,
    store_default: bool = store_default,
    whole_days: bool = whole_days,
    whole_hours: bool = whole_hours,
    whole_minutes: bool = whole_minutes,
    soft_default: Any = MISSING,
    soft_default_factory: Callable[[], Any] | _MissingType = MISSING,
):
def validate_for_field(self, cls: type, field: dataclasses.Field) -> None:
Ensure the IOAttrs instance is ok to use with the provided field.
def validate_datetime(self, value: datetime.datetime, fieldpath: str) -> None:
Ensure a datetime value meets our value requirements.
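As a rough illustration of the boundary checks that validate_datetime performs, the standalone sketch below tests whether a datetime lies exactly on an hour boundary and shows one way to truncate a value so that it does. The helper name `is_whole_hour` is hypothetical and not part of the module; only the standard `datetime` library is used.

```python
import datetime


def is_whole_hour(value: datetime.datetime) -> bool:
    """Return True if all sub-hour components of the value are zero."""
    return all(
        x == 0 for x in (value.minute, value.second, value.microsecond)
    )


on_hour = datetime.datetime(2024, 1, 1, 13, 0, tzinfo=datetime.timezone.utc)
off_hour = on_hour + datetime.timedelta(minutes=5)

# Truncating the sub-hour components yields a value that passes the check.
truncated = off_hour.replace(minute=0, second=0, microsecond=0)

print(is_whole_hour(on_hour))    # True
print(is_whole_hour(off_hour))   # False
print(is_whole_hour(truncated))  # True
```

Note that the method in the listing uses an if/elif chain, so when several of the whole_* flags are set only the coarsest one (days, then hours, then minutes) is actually checked.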
class IOExtendedData:
    """A class that data types can inherit from for extra functionality."""

    def will_output(self) -> None:
        """Called before data is sent to an outputter.

        Can be overridden to validate or filter data before
        sending it on its way.
        """

    @classmethod
    def will_input(cls, data: dict) -> None:
        """Called on raw data before a class instance is created from it.

        Can be overridden to migrate old data formats to new, etc.
        """

    def did_input(self) -> None:
        """Called on a class instance after created from data.

        Can be useful to correct values from the db, etc. in the
        type-safe form.
        """

    # pylint: disable=useless-return

    @classmethod
    def handle_input_error(cls, exc: Exception) -> Self | None:
        """Called when an error occurs during input decoding.

        This allows a type to optionally return substitute data
        to be used in place of the failed decode. If it returns
        None, the original exception is re-raised.

        It is generally a bad idea to apply catch-alls such as this,
        as it can lead to silent data loss. This should only be used
        in specific cases such as user settings where an occasional
        reset is harmless and is preferable to keeping all contained
        enums and other values backward compatible indefinitely.
        """
        del exc  # Unused.

        # By default we let things fail.
        return None

    # pylint: enable=useless-return
A class that data types can inherit from for extra functionality.
def will_output(self) -> None:
Called before data is sent to an outputter.
Can be overridden to validate or filter data before sending it on its way.
@classmethod
def will_input(cls, data: dict) -> None:
Called on raw data before a class instance is created from it.
Can be overridden to migrate old data formats to new, etc.
def did_input(self) -> None:
Called on a class instance after it has been created from data.
Can be useful for correcting values from the db, etc. in type-safe form.
@classmethod
def handle_input_error(cls, exc: Exception) -> Self | None:
Called when an error occurs during input decoding.
This allows a type to optionally return substitute data to be used in place of the failed decode. If it returns None, the original exception is re-raised.
It is generally a bad idea to apply catch-alls such as this, as it can lead to silent data loss. This should only be used in specific cases such as user settings where an occasional reset is harmless and is preferable to keeping all contained enums and other values backward compatible indefinitely.
class IOMultiType(Generic[EnumT]):
    """A base class for types that can map to multiple dataclass types.

    This enables usage of high level base classes (for example
    a 'Message' type) in annotations, with dataclassio automatically
    serializing & deserializing dataclass subclasses based on their
    type ('MessagePing', 'MessageChat', etc.)

    Standard usage involves creating a class which inherits from this
    one which acts as a 'registry', and then creating dataclass classes
    inheriting from that registry class. Dataclassio will then do the
    right thing when that registry class is used in type annotations.

    See tests/test_efro/test_dataclassio.py for examples.
    """

    @classmethod
    def get_type(cls, type_id: EnumT) -> type[Self]:
        """Return a specific subclass given a type-id."""
        raise NotImplementedError()

    @classmethod
    def get_type_id(cls) -> EnumT:
        """Return the type-id for this subclass."""
        raise NotImplementedError()

    @classmethod
    def get_type_id_type(cls) -> type[EnumT]:
        """Return the Enum type this class uses as its type-id."""
        out: type[EnumT] = cls.__orig_bases__[0].__args__[0]  # type: ignore
        assert issubclass(out, Enum)
        return out

    @classmethod
    def get_type_id_storage_name(cls) -> str:
        """Return the key used to store type id in serialized data.

        The default is an obscure value so that it does not conflict
        with members of individual type attrs, but in some cases one
        might prefer to serialize it to something simpler like 'type'
        by overriding this call. One just needs to make sure that no
        encompassed types serialize anything to 'type' themselves.
        """
        return '_dciotype'
A base class for types that can map to multiple dataclass types.
This enables usage of high level base classes (for example a 'Message' type) in annotations, with dataclassio automatically serializing & deserializing dataclass subclasses based on their type ('MessagePing', 'MessageChat', etc.)
Standard usage involves creating a class which inherits from this one which acts as a 'registry', and then creating dataclass classes inheriting from that registry class. Dataclassio will then do the right thing when that registry class is used in type annotations.
See tests/test_efro/test_dataclassio.py for examples.
@classmethod
def get_type(cls, type_id: EnumT) -> type[Self]:
Return a specific subclass given a type-id.
@classmethod
def get_type_id(cls) -> EnumT:
Return the type-id for this subclass.
@classmethod
def get_type_id_type(cls) -> type[EnumT]:
Return the Enum type this class uses as its type-id.
@classmethod
def get_type_id_storage_name(cls) -> str:
Return the key used to store type id in serialized data.
The default is an obscure value so that it does not conflict with the attributes of individual types, but in some cases one might prefer to serialize it to something simpler like 'type' by overriding this call. One just needs to make sure that none of the encompassed types serialize anything to 'type' themselves.
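The registry pattern and type-id key described above can be sketched without the library. The example below is a standalone approximation, not how dataclassio implements it: `MessageTypeID`, `Message`, `MessagePing`, `MessageChat`, `to_dict`, and `from_dict` are all hypothetical names, and the hand-written dict conversion stands in for what the module does automatically. Only the `'_dciotype'` key and the `get_type` / `get_type_id` method names come from the listing above.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from enum import Enum

TYPE_KEY = '_dciotype'  # Default returned by get_type_id_storage_name().


class MessageTypeID(Enum):
    PING = 'ping'
    CHAT = 'chat'


@dataclass
class Message:
    """Hypothetical registry base class."""

    @classmethod
    def get_type_id(cls) -> MessageTypeID:
        raise NotImplementedError()

    @classmethod
    def get_type(cls, type_id: MessageTypeID) -> type[Message]:
        # Map each type-id to the concrete subclass that handles it.
        if type_id is MessageTypeID.PING:
            return MessagePing
        return MessageChat


@dataclass
class MessagePing(Message):
    @classmethod
    def get_type_id(cls) -> MessageTypeID:
        return MessageTypeID.PING


@dataclass
class MessageChat(Message):
    text: str = ''

    @classmethod
    def get_type_id(cls) -> MessageTypeID:
        return MessageTypeID.CHAT


def to_dict(msg: Message) -> dict:
    """Flatten a message, tagging it with its type-id value."""
    out = asdict(msg)
    out[TYPE_KEY] = msg.get_type_id().value
    return out


def from_dict(data: dict) -> Message:
    """Pick the subclass from the stored type-id and rebuild the message."""
    data = dict(data)  # Avoid mutating the caller's dict.
    type_id = MessageTypeID(data.pop(TYPE_KEY))
    return Message.get_type(type_id)(**data)


encoded = to_dict(MessageChat(text='hi'))
decoded = from_dict(encoded)
print(encoded)  # {'text': 'hi', '_dciotype': 'chat'}
```

The obscure key name matters here: if a subclass had its own field stored as `'_dciotype'`, the tag and the field would collide in the flattened dict.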
class JsonStyle(Enum):
    """Different style types for json."""

    # Single line, no spaces, no sorting. Not deterministic.
    # Use this where speed is more important than determinism.
    FAST = 'fast'

    # Single line, no spaces, sorted keys. Deterministic.
    # Use this when output may be hashed or compared for equality.
    SORTED = 'sorted'

    # Multiple lines, spaces, sorted keys. Deterministic.
    # Use this for pretty human readable output.
    PRETTY = 'pretty'
Different style types for json.
Inherited Members
- enum.Enum
- name
- value
def dataclass_from_dict(
    cls: type[T],
    values: dict,
    codec: Codec = Codec.JSON,
    coerce_to_float: bool = True,
    allow_unknown_attrs: bool = True,
    discard_unknown_attrs: bool = False,
) -> T:
    """Given a dict, return a dataclass of a given type.

    The dict must be formatted to match the specified codec (generally
    json-friendly object types). This means that sequence values such as
    tuples or sets should be passed as lists, enums should be passed as
    their associated values, nested dataclasses should be passed as dicts,
    etc.

    All values are checked to ensure their types/values are valid.

    Data for attributes of type Any will be checked to ensure they match
    types supported directly by json. This does not include types such
    as tuples which are implicitly translated by Python's json module
    (as this would break the ability to do a lossless round-trip with
    data).

    If coerce_to_float is True, int values passed for float typed fields
    will be converted to float values. Otherwise, a TypeError is raised.

    If `allow_unknown_attrs` is False, AttributeErrors will be raised for
    attributes present in the dict but not on the data class. Otherwise,
    they will be preserved as part of the instance and included if it is
    exported back to a dict, unless `discard_unknown_attrs` is True, in
    which case they will simply be discarded.
    """
    val = _Inputter(
        cls,
        codec=codec,
        coerce_to_float=coerce_to_float,
        allow_unknown_attrs=allow_unknown_attrs,
        discard_unknown_attrs=discard_unknown_attrs,
    ).run(values)
    assert isinstance(val, cls)
    return val
Given a dict, return a dataclass of a given type.
The dict must be formatted to match the specified codec (generally json-friendly object types). This means that sequence values such as tuples or sets should be passed as lists, enums should be passed as their associated values, nested dataclasses should be passed as dicts, etc.
All values are checked to ensure their types/values are valid.
Data for attributes of type Any will be checked to ensure they match types supported directly by json. This does not include types such as tuples which are implicitly translated by Python's json module (as this would break the ability to do a lossless round-trip with data).
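The round-trip concern is easy to reproduce with the standard `json` module alone. The sketch below shows a tuple silently becoming a list, plus a hypothetical `is_json_native` helper that approximates the kind of check described above (the module's actual check may differ in detail).

```python
import json
from typing import Any


def is_json_native(value: Any) -> bool:
    """Return True if value uses only types json represents directly."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return True
    if isinstance(value, list):
        return all(is_json_native(v) for v in value)
    if isinstance(value, dict):
        return all(
            isinstance(k, str) and is_json_native(v)
            for k, v in value.items()
        )
    return False  # Tuples, sets, bytes, etc. are rejected.


original = {'point': (1, 2)}
restored = json.loads(json.dumps(original))

print(restored)                  # {'point': [1, 2]}
print(restored == original)      # False: the tuple came back as a list.
print(is_json_native(original))  # False
print(is_json_native(restored))  # True
```

Rejecting the tuple up front, rather than letting `json` convert it, is what keeps exported and re-imported data equal to the original.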
If coerce_to_float is True, int values passed for float typed fields will be converted to float values. Otherwise, a TypeError is raised.
If allow_unknown_attrs is False, AttributeErrors will be raised for attributes present in the dict but not on the data class. Otherwise, they will be preserved as part of the instance and included if it is exported back to a dict, unless discard_unknown_attrs is True, in which case they will simply be discarded.
def dataclass_from_json(
    cls: type[T],
    json_str: str,
    coerce_to_float: bool = True,
    allow_unknown_attrs: bool = True,
    discard_unknown_attrs: bool = False,
) -> T:
    """Return a dataclass instance given a json string.

    Basically dataclass_from_dict(json.loads(...))
    """

    return dataclass_from_dict(
        cls=cls,
        values=json.loads(json_str),
        coerce_to_float=coerce_to_float,
        allow_unknown_attrs=allow_unknown_attrs,
        discard_unknown_attrs=discard_unknown_attrs,
    )
Return a dataclass instance given a json string.
Basically dataclass_from_dict(json.loads(...))
def dataclass_to_dict(
    obj: Any,
    codec: Codec = Codec.JSON,
    coerce_to_float: bool = True,
    discard_extra_attrs: bool = False,
) -> dict:
    """Given a dataclass object, return a json-friendly dict.

    All values will be checked to ensure they match the types specified
    on fields. Note that a limited set of types and data configurations is
    supported.

    Values with type Any will be checked to ensure they match types supported
    directly by json. This does not include types such as tuples which are
    implicitly translated by Python's json module (as this would break
    the ability to do a lossless round-trip with data).

    If coerce_to_float is True, integer values present on float typed fields
    will be converted to float in the dict output. If False, a TypeError
    will be triggered.
    """

    out = _Outputter(
        obj,
        create=True,
        codec=codec,
        coerce_to_float=coerce_to_float,
        discard_extra_attrs=discard_extra_attrs,
    ).run()
    assert isinstance(out, dict)
    return out
Given a dataclass object, return a json-friendly dict.
All values will be checked to ensure they match the types specified on fields. Note that a limited set of types and data configurations is supported.
Values with type Any will be checked to ensure they match types supported directly by json. This does not include types such as tuples which are implicitly translated by Python's json module (as this would break the ability to do a lossless round-trip with data).
If coerce_to_float is True, integer values present on float typed fields will be converted to float in the dict output. If False, a TypeError will be triggered.
def dataclass_to_json(
    obj: Any,
    coerce_to_float: bool = True,
    pretty: bool = False,
    sort_keys: bool | None = None,
) -> str:
    """Utility function; return a json string from a dataclass instance.

    Basically json.dumps(dataclass_to_dict(...)).
    By default, keys are sorted for pretty output and not otherwise, but
    this can be overridden by supplying a value for the 'sort_keys' arg.
    """

    jdict = dataclass_to_dict(
        obj=obj, coerce_to_float=coerce_to_float, codec=Codec.JSON
    )
    if sort_keys is None:
        sort_keys = pretty
    if pretty:
        return json.dumps(jdict, indent=2, sort_keys=sort_keys)
    return json.dumps(jdict, separators=(',', ':'), sort_keys=sort_keys)
Utility function; return a json string from a dataclass instance.
Basically json.dumps(dataclass_to_dict(...)). By default, keys are sorted for pretty output and not otherwise, but this can be overridden by supplying a value for the 'sort_keys' arg.
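The two output shapes can be seen by calling `json.dumps` directly with the same arguments that appear in the listing above. The plain dict below is an illustrative stand-in for the result of dataclass_to_dict.

```python
import json

# Stand-in for the dict that dataclass_to_dict() would produce.
data = {'b': 1, 'a': [1, 2]}

# Non-pretty path: single line, no spaces, insertion order preserved.
compact = json.dumps(data, separators=(',', ':'), sort_keys=False)

# Pretty path: two-space indent with keys sorted.
pretty = json.dumps(data, indent=2, sort_keys=True)

print(compact)  # {"b":1,"a":[1,2]}
print(pretty)
```

Both strings decode to the same data; they differ only in whitespace and key order, which is why sorted output is the safer choice when strings are compared or hashed.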
def dataclass_validate(
    obj: Any,
    coerce_to_float: bool = True,
    codec: Codec = Codec.JSON,
    discard_extra_attrs: bool = False,
) -> None:
    """Ensure that values in a dataclass instance are the correct types."""

    # Simply run an output pass but tell it not to generate data;
    # only run validation.
    _Outputter(
        obj,
        create=False,
        codec=codec,
        coerce_to_float=coerce_to_float,
        discard_extra_attrs=discard_extra_attrs,
    ).run()
Ensure that values in a dataclass instance are the correct types.
def dataclass_hash(obj: Any, coerce_to_float: bool = True) -> str:
    """Calculate a hash for the provided dataclass.

    Basically this emits json for the dataclass (with keys sorted
    to keep things deterministic) and hashes the resulting string.
    """
    import hashlib
    from base64 import urlsafe_b64encode

    json_dict = dataclass_to_dict(
        obj, codec=Codec.JSON, coerce_to_float=coerce_to_float
    )

    # Need to sort keys to keep things deterministic.
    json_str = json.dumps(json_dict, separators=(',', ':'), sort_keys=True)

    sha = hashlib.sha256()
    sha.update(json_str.encode())

    # Go with urlsafe base64 instead of the usual hex to save some
    # space, and kill those ugly padding chars at the end.
    return urlsafe_b64encode(sha.digest()).decode().strip('=')
Calculate a hash for the provided dataclass.
Basically this emits json for the dataclass (with keys sorted to keep things deterministic) and hashes the resulting string.
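The hashing steps can be tried out without the library. The sketch below follows the same recipe (compact json with sorted keys, SHA-256, URL-safe base64 with padding stripped) but uses dataclasses.asdict() in place of dataclass_to_dict(), so its results are not guaranteed to match dataclass_hash() for the same object. Settings and hash_sketch() are hypothetical names.

```python
import dataclasses
import hashlib
import json
from base64 import urlsafe_b64encode


@dataclasses.dataclass
class Settings:
    volume: int
    name: str


def hash_sketch(obj):
    """Hash a flat dataclass via sorted-key json, as described above."""
    data = dataclasses.asdict(obj)
    # Sorted keys make the json string independent of field order.
    json_str = json.dumps(data, separators=(',', ':'), sort_keys=True)
    digest = hashlib.sha256(json_str.encode()).digest()
    # URL-safe base64 is shorter than hex; drop the trailing '=' padding.
    return urlsafe_b64encode(digest).decode().strip('=')


first = hash_sketch(Settings(volume=5, name='main'))
second = hash_sketch(Settings(volume=5, name='main'))
changed = hash_sketch(Settings(volume=6, name='main'))
```

A SHA-256 digest is 32 bytes, so the encoded result is 43 characters once the single padding character is removed, compared with 64 characters for hex.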
def ioprep(cls: type, globalns: dict | None = None) -> None:
    """Prep a dataclass type for use with this module's functionality.

    Prepping ensures that all types contained in a dataclass as well as
    the usage of said types are supported by this module and pre-builds
    necessary constructs needed for encoding/decoding/etc.

    Prepping will happen on-the-fly as needed, but a warning will be
    emitted in such cases, as it is better to explicitly prep all used types
    early in a process to ensure any invalid types or configuration are caught
    immediately.

    Prepping a dataclass involves evaluating its type annotations, which,
    as of PEP 563, are stored simply as strings. This evaluation is done
    with localns set to the class dict (so that types defined in the class
    can be used) and globalns set to the containing module's dict.
    It is possible to override globalns for special cases such as when
    prepping happens as part of an execed string instead of within a
    module.
    """
    PrepSession(explicit=True, globalns=globalns).prep_dataclass(
        cls, recursion_level=0
    )
Prep a dataclass type for use with this module's functionality.
Prepping ensures that all types contained in a dataclass, as well as the usage of said types, are supported by this module, and pre-builds the constructs needed for encoding/decoding/etc.
Prepping will happen on-the-fly as needed, but a warning will be emitted in such cases, as it is better to explicitly prep all used types early in a process to ensure any invalid types or configuration are caught immediately.
Prepping a dataclass involves evaluating its type annotations, which, as of PEP 563, are stored simply as strings. This evaluation is done with localns set to the class dict (so that types defined in the class can be used) and globalns set to the containing module's dict. It is possible to override globalns for special cases such as when prepping happens as part of an execed string instead of within a module.
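The annotation-evaluation step can be sketched with typing.get_type_hints(); this illustrates the localns/globalns idea and is not a claim about how PrepSession is implemented. Quoted annotations stand in for what 'from __future__ import annotations' does to every annotation, and the namespace is always passed explicitly to mimic the globalns override. Inventory, Owner, and resolve_annotations() are hypothetical names.

```python
import dataclasses
import typing


class Owner:
    """Hypothetical type supplied only through an explicit namespace."""


@dataclasses.dataclass
class Inventory:
    # Quoted annotations are stored as plain strings until evaluated.
    count: 'int'
    tags: 'list[str]'
    owner: 'Owner'


def resolve_annotations(cls, globalns):
    """Evaluate string annotations with the class dict as localns."""
    return typing.get_type_hints(
        cls,
        globalns=dict(globalns),   # copy; evaluation may add builtins
        localns=dict(vars(cls)),   # names defined inside the class body
    )


raw = dict(Inventory.__annotations__)            # still strings
hints = resolve_annotations(Inventory, {'Owner': Owner})
```

Resolving against a namespace that lacks Owner raises NameError, which is the kind of failure that prepping early is meant to surface at import time rather than on first use.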
def ioprepped(cls: type[T]) -> type[T]:
    """Class decorator for easily prepping a dataclass at definition time.

    Note that in some cases it may not be possible to prep a dataclass
    immediately (such as when its type annotations refer to forward-declared
    types). In these cases, ioprep() should be explicitly called for
    the class as soon as possible; ideally at module import time to expose any
    errors as early as possible in execution.
    """
    ioprep(cls)
    return cls
Class decorator for easily prepping a dataclass at definition time.
Note that in some cases it may not be possible to prep a dataclass immediately (such as when its type annotations refer to forward-declared types). In these cases, ioprep() should be called explicitly for the class as soon as possible, ideally at module import time, to expose any errors as early as possible in execution.
def is_ioprepped_dataclass(obj: Any) -> bool:
    """Return whether the obj is an ioprepped dataclass type or instance."""
    cls = obj if isinstance(obj, type) else type(obj)
    return dataclasses.is_dataclass(cls) and hasattr(cls, PREP_ATTR)
Return whether the obj is an ioprepped dataclass type or instance.
def will_ioprep(cls: type[T]) -> type[T]:
    """Class decorator hinting that we will prep a class later.

    In some cases (such as recursive types) we cannot use the @ioprepped
    decorator and must instead call ioprep() explicitly later. However,
    some of our custom pylint checking behaves differently when the
    @ioprepped decorator is present, in that case requiring type annotations
    to be present and not simply forward declared under an "if TYPE_CHECKING"
    block (since they are used at runtime).

    The @will_ioprep decorator triggers the same pylint behavior
    differences as @ioprepped (which are necessary for the later ioprep() call
    to work correctly) but without actually running any prep itself.
    """
    return cls
Class decorator hinting that we will prep a class later.
In some cases (such as recursive types) we cannot use the @ioprepped decorator and must instead call ioprep() explicitly later. However, some of our custom pylint checking behaves differently when the @ioprepped decorator is present: it requires type annotations to be present at runtime (since they are used at runtime) rather than simply forward declared under an "if TYPE_CHECKING" block.
The @will_ioprep decorator triggers the same pylint behavior differences as @ioprepped (which are necessary for the later ioprep() call to work correctly) but without actually running any prep itself.
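The recursive-type case can be sketched as follows: a class that refers to itself cannot have its annotations resolved while its own class statement is still running, so a do-nothing marker decorator is applied first and the actual prep happens afterwards. All names here (PREPPED, prep_sketch(), will_prep_sketch(), TreeNode) are hypothetical, and the prep step is a simplified stand-in based on typing.get_type_hints() rather than the real ioprep().

```python
import dataclasses
import typing

PREPPED = '_sketch_prepped'  # hypothetical marker attribute


def prep_sketch(cls, namespace):
    """Resolve string annotations now and cache them on the class."""
    hints = typing.get_type_hints(
        cls, globalns=dict(namespace), localns=dict(vars(cls))
    )
    setattr(cls, PREPPED, hints)


def will_prep_sketch(cls):
    """Marker only: returns the class untouched, as will_ioprep() does."""
    return cls


@will_prep_sketch
@dataclasses.dataclass
class TreeNode:
    value: 'int'
    # Refers to the class being defined, so it cannot be resolved until
    # the class statement has finished.
    children: 'list[TreeNode]'


prepped_before = hasattr(TreeNode, PREPPED)   # marker did no prep
# The explicit prep call happens once the name 'TreeNode' exists.
prep_sketch(TreeNode, {'TreeNode': TreeNode})
resolved = getattr(TreeNode, PREPPED)
```

The marker exists purely for tooling; at runtime the only thing that matters is that the explicit prep call runs after every referenced type has been defined.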