efro.dataclassio
Functionality for importing, exporting, and validating dataclasses.
This allows complex nested dataclasses to be flattened to json-compatible data and restored from said data. It also gracefully handles and preserves unrecognized attribute data, allowing older clients to interact with newer data formats in a nondestructive manner.
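For orientation, here is a minimal sketch of the round trip this enables; the Account type is invented for illustration:

from dataclasses import dataclass, field

from efro.dataclassio import (
    ioprepped,
    dataclass_to_dict,
    dataclass_from_dict,
)


@ioprepped
@dataclass
class Account:
    name: str = ''
    scores: list[int] = field(default_factory=list)


acct = Account(name='Flo', scores=[10, 20])

# Flatten to plain json-compatible types.
flat = dataclass_to_dict(acct)
assert flat == {'name': 'Flo', 'scores': [10, 20]}

# Unrecognized attrs (written by some newer client, perhaps) survive
# a round trip instead of being destroyed.
flat['futureattr'] = 42
restored = dataclass_from_dict(Account, flat)
assert dataclass_to_dict(restored)['futureattr'] == 42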
# Released under the MIT License. See LICENSE for details.
#
"""Functionality for importing, exporting, and validating dataclasses.

This allows complex nested dataclasses to be flattened to json-compatible
data and restored from said data. It also gracefully handles and preserves
unrecognized attribute data, allowing older clients to interact with newer
data formats in a nondestructive manner.
"""

from __future__ import annotations

from efro.util import set_canonical_module_names
from efro.dataclassio._base import (
    Codec,
    IOAttrs,
    IOExtendedData,
    IOMultiType,
    EXTRA_ATTRS_ATTR,
)
from efro.dataclassio._prep import (
    ioprep,
    ioprepped,
    will_ioprep,
    is_ioprepped_dataclass,
)
from efro.dataclassio._pathcapture import DataclassFieldLookup
from efro.dataclassio._api import (
    JsonStyle,
    dataclass_to_dict,
    dataclass_to_json,
    dataclass_from_dict,
    dataclass_from_json,
    dataclass_validate,
    dataclass_hash,
)

__all__ = [
    'Codec',
    'DataclassFieldLookup',
    'EXTRA_ATTRS_ATTR',
    'IOAttrs',
    'IOExtendedData',
    'IOMultiType',
    'JsonStyle',
    'dataclass_from_dict',
    'dataclass_from_json',
    'dataclass_to_dict',
    'dataclass_to_json',
    'dataclass_validate',
    'dataclass_hash',
    'ioprep',
    'ioprepped',
    'is_ioprepped_dataclass',
    'will_ioprep',
]

# Have these things present themselves cleanly as 'thismodule.SomeClass'
# instead of 'thismodule._internalmodule.SomeClass'
set_canonical_module_names(globals())
class Codec(Enum):
    """Specifies the expected data format to export to or import from."""

    # Use only types that will translate cleanly to/from json: lists,
    # dicts with str keys, bools, ints, floats, and None.
    JSON = 'json'

    # Mostly like JSON but passes bytes and datetime objects through
    # as-is instead of converting them to json-friendly types.
    FIRESTORE = 'firestore'
Specifies the expected data format to export to or import from.
class DataclassFieldLookup(Generic[T]):
    """Get info about nested dataclass fields in a type-safe way."""

    def __init__(self, cls: type[T]) -> None:
        self.cls = cls

    def path(self, callback: Callable[[T], Any]) -> str:
        """Look up a path on child dataclass fields.

        example:
          DataclassFieldLookup(MyType).path(lambda obj: obj.foo.bar)

        The above example will return the string 'foo.bar' or something
        like 'f.b' if the dataclasses have custom storage names set.
        It will also be static-type-checked, triggering an error if
        MyType.foo.bar is not a valid path. Note, however, that the
        callback technically allows any return value but only nested
        dataclasses and their fields will succeed.
        """

        # We tell the type system that we are returning an instance
        # of our class, which allows it to perform type checking on
        # member lookups. In reality, however, we are providing a
        # special object which captures path lookups, so we can build
        # a string from them.
        if not TYPE_CHECKING:
            out = callback(_PathCapture(self.cls))
            if not isinstance(out, _PathCapture):
                raise TypeError(
                    f'Expected a valid path under'
                    f' the provided object; got a {type(out)}.'
                )
            return out.path
        return ''

    def paths(self, callback: Callable[[T], list[Any]]) -> list[str]:
        """Look up multiple paths on child dataclass fields.

        Functionality is identical to path() but for multiple paths
        at once.

        example:
          DataclassFieldLookup(MyType).paths(lambda obj: [obj.foo, obj.bar])
        """
        outvals: list[str] = []
        if not TYPE_CHECKING:
            outs = callback(_PathCapture(self.cls))
            assert isinstance(outs, list)
            for out in outs:
                if not isinstance(out, _PathCapture):
                    raise TypeError(
                        f'Expected a valid path under'
                        f' the provided object; got a {type(out)}.'
                    )
                outvals.append(out.path)
        return outvals
Get info about nested dataclass fields in a type-safe way.
    def path(self, callback: Callable[[T], Any]) -> str:
        """Look up a path on child dataclass fields.

        example:
          DataclassFieldLookup(MyType).path(lambda obj: obj.foo.bar)

        The above example will return the string 'foo.bar' or something
        like 'f.b' if the dataclasses have custom storage names set.
        It will also be static-type-checked, triggering an error if
        MyType.foo.bar is not a valid path. Note, however, that the
        callback technically allows any return value but only nested
        dataclasses and their fields will succeed.
        """

        # We tell the type system that we are returning an instance
        # of our class, which allows it to perform type checking on
        # member lookups. In reality, however, we are providing a
        # special object which captures path lookups, so we can build
        # a string from them.
        if not TYPE_CHECKING:
            out = callback(_PathCapture(self.cls))
            if not isinstance(out, _PathCapture):
                raise TypeError(
                    f'Expected a valid path under'
                    f' the provided object; got a {type(out)}.'
                )
            return out.path
        return ''
Look up a path on child dataclass fields.
example:
  DataclassFieldLookup(MyType).path(lambda obj: obj.foo.bar)
The above example will return the string 'foo.bar' or something like 'f.b' if the dataclasses have custom storage names set. It will also be static-type-checked, triggering an error if MyType.foo.bar is not a valid path. Note, however, that the callback technically allows any return value but only nested dataclasses and their fields will succeed.
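A small sketch of both the plain and custom-storage-name cases; the types and names here are invented for illustration:

from dataclasses import dataclass, field
from typing import Annotated

from efro.dataclassio import DataclassFieldLookup, IOAttrs, ioprepped


@ioprepped
@dataclass
class SubType:
    bar: Annotated[int, IOAttrs('b')] = 0


@ioprepped
@dataclass
class MyType:
    foo: Annotated[SubType, IOAttrs('f')] = field(default_factory=SubType)
    plain: int = 0


lookup = DataclassFieldLookup(MyType)
assert lookup.path(lambda obj: obj.plain) == 'plain'
# Custom storage names show up in the resulting path.
assert lookup.path(lambda obj: obj.foo.bar) == 'f.b'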
    def paths(self, callback: Callable[[T], list[Any]]) -> list[str]:
        """Look up multiple paths on child dataclass fields.

        Functionality is identical to path() but for multiple paths
        at once.

        example:
          DataclassFieldLookup(MyType).paths(lambda obj: [obj.foo, obj.bar])
        """
        outvals: list[str] = []
        if not TYPE_CHECKING:
            outs = callback(_PathCapture(self.cls))
            assert isinstance(outs, list)
            for out in outs:
                if not isinstance(out, _PathCapture):
                    raise TypeError(
                        f'Expected a valid path under'
                        f' the provided object; got a {type(out)}.'
                    )
                outvals.append(out.path)
        return outvals
Look up multiple paths on child dataclass fields.
Functionality is identical to path() but for multiple paths at once.
example:
  DataclassFieldLookup(MyType).paths(lambda obj: [obj.foo, obj.bar])
class IOAttrs:
    """For specifying io behavior in annotations.

    'storagename', if passed, is the name used when storing to json/etc.
    'store_default' can be set to False to avoid writing values when equal
      to the default value. Note that this requires the dataclass field
      to define a default or default_factory or for its IOAttrs to
      define a soft_default value.
    'whole_days', if True, requires datetime values to be exactly on day
      boundaries (see efro.util.utc_today()).
    'whole_hours', if True, requires datetime values to lie exactly on hour
      boundaries (see efro.util.utc_this_hour()).
    'whole_minutes', if True, requires datetime values to lie exactly on
      minute boundaries (see efro.util.utc_this_minute()).
    'soft_default', if passed, injects a default value into dataclass
      instantiation when the field is not present in the input data.
      This allows dataclasses to add new non-optional fields while
      gracefully 'upgrading' old data. Note that when a soft_default is
      present it will take precedence over field defaults when determining
      whether to store a value for a field with store_default=False
      (since the soft_default value is what we'll get when reading that
      same data back in when the field is omitted).
    'soft_default_factory' is similar to 'default_factory' in dataclass
      fields; it should be used instead of 'soft_default' for mutable types
      such as lists to prevent a single default object from unintentionally
      changing over time.
    """

    # A sentinel object to detect if a parameter is supplied or not. Use
    # a class to give it a better repr.
    class _MissingType:
        pass

    MISSING = _MissingType()

    storagename: str | None = None
    store_default: bool = True
    whole_days: bool = False
    whole_hours: bool = False
    whole_minutes: bool = False
    soft_default: Any = MISSING
    soft_default_factory: Callable[[], Any] | _MissingType = MISSING

    def __init__(
        self,
        storagename: str | None = storagename,
        *,
        store_default: bool = store_default,
        whole_days: bool = whole_days,
        whole_hours: bool = whole_hours,
        whole_minutes: bool = whole_minutes,
        soft_default: Any = MISSING,
        soft_default_factory: Callable[[], Any] | _MissingType = MISSING,
    ):
        # Only store values that differ from class defaults to keep
        # our instances nice and lean.
        cls = type(self)
        if storagename != cls.storagename:
            self.storagename = storagename
        if store_default != cls.store_default:
            self.store_default = store_default
        if whole_days != cls.whole_days:
            self.whole_days = whole_days
        if whole_hours != cls.whole_hours:
            self.whole_hours = whole_hours
        if whole_minutes != cls.whole_minutes:
            self.whole_minutes = whole_minutes
        if soft_default is not cls.soft_default:
            # Do what dataclasses does with its default types and
            # tell the user to use factory for mutable ones.
            if isinstance(soft_default, (list, dict, set)):
                raise ValueError(
                    f'mutable {type(soft_default)} is not allowed'
                    f' for soft_default; use soft_default_factory.'
                )
            self.soft_default = soft_default
        if soft_default_factory is not cls.soft_default_factory:
            self.soft_default_factory = soft_default_factory
            if self.soft_default is not cls.soft_default:
                raise ValueError(
                    'Cannot set both soft_default and soft_default_factory'
                )

    def validate_for_field(self, cls: type, field: dataclasses.Field) -> None:
        """Ensure the IOAttrs instance is ok to use with the provided field."""

        # Turning off store_default requires the field to have either
        # a default or a default_factory or for us to have soft
        # equivalents.

        if not self.store_default:
            field_default_factory: Any = field.default_factory
            if (
                field_default_factory is dataclasses.MISSING
                and field.default is dataclasses.MISSING
                and self.soft_default is self.MISSING
                and self.soft_default_factory is self.MISSING
            ):
                raise TypeError(
                    f'Field {field.name} of {cls} has'
                    f' neither a default nor a default_factory'
                    f' and IOAttrs contains neither a soft_default'
                    f' nor a soft_default_factory;'
                    f' store_default=False cannot be set for it.'
                )

    def validate_datetime(
        self, value: datetime.datetime, fieldpath: str
    ) -> None:
        """Ensure a datetime value meets our value requirements."""
        if self.whole_days:
            if any(
                x != 0
                for x in (
                    value.hour,
                    value.minute,
                    value.second,
                    value.microsecond,
                )
            ):
                raise ValueError(
                    f'Value {value} at {fieldpath} is not a whole day.'
                )
        elif self.whole_hours:
            if any(
                x != 0 for x in (value.minute, value.second, value.microsecond)
            ):
                raise ValueError(
                    f'Value {value} at {fieldpath} is not a whole hour.'
                )
        elif self.whole_minutes:
            if any(x != 0 for x in (value.second, value.microsecond)):
                raise ValueError(
                    f'Value {value} at {fieldpath} is not a whole minute.'
                )
For specifying io behavior in annotations.
'storagename', if passed, is the name used when storing to json/etc.

'store_default' can be set to False to avoid writing values when equal to the default value. Note that this requires the dataclass field to define a default or default_factory or for its IOAttrs to define a soft_default value.

'whole_days', if True, requires datetime values to be exactly on day boundaries (see efro.util.utc_today()).

'whole_hours', if True, requires datetime values to lie exactly on hour boundaries (see efro.util.utc_this_hour()).

'whole_minutes', if True, requires datetime values to lie exactly on minute boundaries (see efro.util.utc_this_minute()).

'soft_default', if passed, injects a default value into dataclass instantiation when the field is not present in the input data. This allows dataclasses to add new non-optional fields while gracefully 'upgrading' old data. Note that when a soft_default is present it will take precedence over field defaults when determining whether to store a value for a field with store_default=False (since the soft_default value is what we'll get when reading that same data back in when the field is omitted).

'soft_default_factory' is similar to 'default_factory' in dataclass fields; it should be used instead of 'soft_default' for mutable types such as lists to prevent a single default object from unintentionally changing over time.
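A minimal sketch of typical usage via typing.Annotated; the Config type and its values are invented for illustration:

from dataclasses import dataclass
from typing import Annotated

from efro.dataclassio import (
    IOAttrs,
    ioprepped,
    dataclass_to_dict,
    dataclass_from_dict,
)


@ioprepped
@dataclass
class Config:
    # Stored under the short key 'dm'; omitted when equal to its default.
    dark_mode: Annotated[bool, IOAttrs('dm', store_default=False)] = False

    # A newer field: old stored data lacking it gets 2 injected on load,
    # and values equal to that soft default are likewise not written.
    retries: Annotated[
        int, IOAttrs('r', store_default=False, soft_default=2)
    ] = 2


assert dataclass_to_dict(Config()) == {}
assert dataclass_from_dict(Config, {}).retries == 2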
    def __init__(
        self,
        storagename: str | None = storagename,
        *,
        store_default: bool = store_default,
        whole_days: bool = whole_days,
        whole_hours: bool = whole_hours,
        whole_minutes: bool = whole_minutes,
        soft_default: Any = MISSING,
        soft_default_factory: Callable[[], Any] | _MissingType = MISSING,
    ):
        # Only store values that differ from class defaults to keep
        # our instances nice and lean.
        cls = type(self)
        if storagename != cls.storagename:
            self.storagename = storagename
        if store_default != cls.store_default:
            self.store_default = store_default
        if whole_days != cls.whole_days:
            self.whole_days = whole_days
        if whole_hours != cls.whole_hours:
            self.whole_hours = whole_hours
        if whole_minutes != cls.whole_minutes:
            self.whole_minutes = whole_minutes
        if soft_default is not cls.soft_default:
            # Do what dataclasses does with its default types and
            # tell the user to use factory for mutable ones.
            if isinstance(soft_default, (list, dict, set)):
                raise ValueError(
                    f'mutable {type(soft_default)} is not allowed'
                    f' for soft_default; use soft_default_factory.'
                )
            self.soft_default = soft_default
        if soft_default_factory is not cls.soft_default_factory:
            self.soft_default_factory = soft_default_factory
            if self.soft_default is not cls.soft_default:
                raise ValueError(
                    'Cannot set both soft_default and soft_default_factory'
                )
    def validate_for_field(self, cls: type, field: dataclasses.Field) -> None:
        """Ensure the IOAttrs instance is ok to use with the provided field."""

        # Turning off store_default requires the field to have either
        # a default or a default_factory or for us to have soft
        # equivalents.

        if not self.store_default:
            field_default_factory: Any = field.default_factory
            if (
                field_default_factory is dataclasses.MISSING
                and field.default is dataclasses.MISSING
                and self.soft_default is self.MISSING
                and self.soft_default_factory is self.MISSING
            ):
                raise TypeError(
                    f'Field {field.name} of {cls} has'
                    f' neither a default nor a default_factory'
                    f' and IOAttrs contains neither a soft_default'
                    f' nor a soft_default_factory;'
                    f' store_default=False cannot be set for it.'
                )
Ensure the IOAttrs instance is ok to use with the provided field.
    def validate_datetime(
        self, value: datetime.datetime, fieldpath: str
    ) -> None:
        """Ensure a datetime value meets our value requirements."""
        if self.whole_days:
            if any(
                x != 0
                for x in (
                    value.hour,
                    value.minute,
                    value.second,
                    value.microsecond,
                )
            ):
                raise ValueError(
                    f'Value {value} at {fieldpath} is not a whole day.'
                )
        elif self.whole_hours:
            if any(
                x != 0 for x in (value.minute, value.second, value.microsecond)
            ):
                raise ValueError(
                    f'Value {value} at {fieldpath} is not a whole hour.'
                )
        elif self.whole_minutes:
            if any(x != 0 for x in (value.second, value.microsecond)):
                raise ValueError(
                    f'Value {value} at {fieldpath} is not a whole minute.'
                )
Ensure a datetime value meets our value requirements.
class IOExtendedData:
    """A class that data types can inherit from for extra functionality."""

    def will_output(self) -> None:
        """Called before data is sent to an outputter.

        Can be overridden to validate or filter data before
        sending it on its way.
        """

    @classmethod
    def will_input(cls, data: dict) -> None:
        """Called on raw data before a class instance is created from it.

        Can be overridden to migrate old data formats to new, etc.
        """

    def did_input(self) -> None:
        """Called on a class instance after it is created from data.

        Can be useful for correcting values from the db, etc., in the
        type-safe form.
        """

    # pylint: disable=useless-return

    @classmethod
    def handle_input_error(cls, exc: Exception) -> Self | None:
        """Called when an error occurs during input decoding.

        This allows a type to optionally return substitute data
        to be used in place of the failed decode. If it returns
        None, the original exception is re-raised.

        It is generally a bad idea to apply catch-alls such as this,
        as it can lead to silent data loss. This should only be used
        in specific cases such as user settings where an occasional
        reset is harmless and is preferable to keeping all contained
        enums and other values backward compatible indefinitely.
        """
        del exc  # Unused.

        # By default we let things fail.
        return None

    # pylint: enable=useless-return
A class that data types can inherit from for extra functionality.
    def will_output(self) -> None:
        """Called before data is sent to an outputter.

        Can be overridden to validate or filter data before
        sending it on its way.
        """
Called before data is sent to an outputter.
Can be overridden to validate or filter data before sending it on its way.
    @classmethod
    def will_input(cls, data: dict) -> None:
        """Called on raw data before a class instance is created from it.

        Can be overridden to migrate old data formats to new, etc.
        """
Called on raw data before a class instance is created from it.
Can be overridden to migrate old data formats to new, etc.
    def did_input(self) -> None:
        """Called on a class instance after it is created from data.

        Can be useful for correcting values from the db, etc., in the
        type-safe form.
        """
Called on a class instance after it is created from data.
Can be useful for correcting values from the db, etc., in the type-safe form.
    @classmethod
    def handle_input_error(cls, exc: Exception) -> Self | None:
        """Called when an error occurs during input decoding.

        This allows a type to optionally return substitute data
        to be used in place of the failed decode. If it returns
        None, the original exception is re-raised.

        It is generally a bad idea to apply catch-alls such as this,
        as it can lead to silent data loss. This should only be used
        in specific cases such as user settings where an occasional
        reset is harmless and is preferable to keeping all contained
        enums and other values backward compatible indefinitely.
        """
        del exc  # Unused.

        # By default we let things fail.
        return None
Called when an error occurs during input decoding.
This allows a type to optionally return substitute data to be used in place of the failed decode. If it returns None, the original exception is re-raised.
It is generally a bad idea to apply catch-alls such as this, as it can lead to silent data loss. This should only be used in specific cases such as user settings where an occasional reset is harmless and is preferable to keeping all contained enums and other values backward compatible indefinitely.
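As a sketch of the settings use case the docstring mentions, a type might reset itself to defaults on any decode failure; the UserSettings type here is invented for illustration:

from __future__ import annotations

from dataclasses import dataclass
from typing import Self

from efro.dataclassio import IOExtendedData, ioprepped


@ioprepped
@dataclass
class UserSettings(IOExtendedData):
    volume: float = 1.0
    show_tips: bool = True

    @classmethod
    def handle_input_error(cls, exc: Exception) -> Self | None:
        del exc  # Unused.

        # Settings are expendable; an occasional reset to defaults
        # beats failing the whole load.
        return cls()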
class IOMultiType(Generic[EnumT]):
    """A base class for types that can map to multiple dataclass types.

    This enables usage of high-level base classes (for example a
    'Message' type) in annotations, with dataclassio automatically
    serializing & deserializing dataclass subclasses based on their
    type ('MessagePing', 'MessageChat', etc.).

    Standard usage involves creating a class inheriting from this one
    to act as a 'registry', and then creating dataclass classes that
    inherit from that registry class. Dataclassio will then do the
    right thing when the registry class is used in type annotations.

    See tests/test_efro/test_dataclassio.py for examples.
    """

    @classmethod
    def get_type(cls, type_id: EnumT) -> type[Self]:
        """Return a specific subclass given a type-id."""
        raise NotImplementedError()

    @classmethod
    def get_type_id(cls) -> EnumT:
        """Return the type-id for this subclass."""
        raise NotImplementedError()

    @classmethod
    def get_type_id_type(cls) -> type[EnumT]:
        """Return the Enum type this class uses as its type-id."""
        out: type[EnumT] = cls.__orig_bases__[0].__args__[0]  # type: ignore
        assert issubclass(out, Enum)
        return out

    @classmethod
    def get_type_id_storage_name(cls) -> str:
        """Return the key used to store type id in serialized data.

        The default is an obscure value so that it does not conflict
        with members of individual type attrs, but in some cases one
        might prefer to serialize it to something simpler like 'type'
        by overriding this call. One just needs to make sure that no
        encompassed types serialize anything to 'type' themselves.
        """
        return '_dciotype'
A base class for types that can map to multiple dataclass types.
This enables usage of high-level base classes (for example a 'Message' type) in annotations, with dataclassio automatically serializing & deserializing dataclass subclasses based on their type ('MessagePing', 'MessageChat', etc.).
Standard usage involves creating a class inheriting from this one to act as a 'registry', and then creating dataclass classes that inherit from that registry class. Dataclassio will then do the right thing when the registry class is used in type annotations.
See tests/test_efro/test_dataclassio.py for examples.
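A condensed sketch of that pattern; the message types here are invented for illustration:

from __future__ import annotations

from dataclasses import dataclass
from enum import Enum

from efro.dataclassio import IOMultiType, ioprepped


class MessageTypeID(Enum):
    PING = 'ping'
    CHAT = 'chat'


class Message(IOMultiType[MessageTypeID]):
    """Registry class; this is what goes in annotations."""

    @classmethod
    def get_type(cls, type_id: MessageTypeID) -> type[Message]:
        if type_id is MessageTypeID.PING:
            return MessagePing
        if type_id is MessageTypeID.CHAT:
            return MessageChat
        raise ValueError(f'Unhandled type-id: {type_id}')

    @classmethod
    def get_type_id(cls) -> MessageTypeID:
        raise NotImplementedError()


@ioprepped
@dataclass
class MessagePing(Message):
    @classmethod
    def get_type_id(cls) -> MessageTypeID:
        return MessageTypeID.PING


@ioprepped
@dataclass
class MessageChat(Message):
    text: str = ''

    @classmethod
    def get_type_id(cls) -> MessageTypeID:
        return MessageTypeID.CHAT

A field annotated simply as Message can then hold any of these concrete types, with dataclassio writing the type-id alongside the member's own data.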
    @classmethod
    def get_type(cls, type_id: EnumT) -> type[Self]:
        """Return a specific subclass given a type-id."""
        raise NotImplementedError()
Return a specific subclass given a type-id.
    @classmethod
    def get_type_id(cls) -> EnumT:
        """Return the type-id for this subclass."""
        raise NotImplementedError()
Return the type-id for this subclass.
    @classmethod
    def get_type_id_type(cls) -> type[EnumT]:
        """Return the Enum type this class uses as its type-id."""
        out: type[EnumT] = cls.__orig_bases__[0].__args__[0]  # type: ignore
        assert issubclass(out, Enum)
        return out
Return the Enum type this class uses as its type-id.
    @classmethod
    def get_type_id_storage_name(cls) -> str:
        """Return the key used to store type id in serialized data.

        The default is an obscure value so that it does not conflict
        with members of individual type attrs, but in some cases one
        might prefer to serialize it to something simpler like 'type'
        by overriding this call. One just needs to make sure that no
        encompassed types serialize anything to 'type' themselves.
        """
        return '_dciotype'
Return the key used to store type id in serialized data.
The default is an obscure value so that it does not conflict with members of individual type attrs, but in some cases one might prefer to serialize it to something simpler like 'type' by overriding this call. One just needs to make sure that no encompassed types serialize anything to 'type' themselves.
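For instance, the hypothetical Message registry from the earlier sketch could opt for a plain 'type' key, assuming none of its member classes store anything under that name themselves:

class Message(IOMultiType[MessageTypeID]):
    ...

    @classmethod
    def get_type_id_storage_name(cls) -> str:
        return 'type'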
class JsonStyle(Enum):
    """Different style types for json."""

    # Single line, no spaces, no sorting. Not deterministic.
    # Use this where speed is more important than determinism.
    FAST = 'fast'

    # Single line, no spaces, sorted keys. Deterministic.
    # Use this when output may be hashed or compared for equality.
    SORTED = 'sorted'

    # Multiple lines, spaces, sorted keys. Deterministic.
    # Use this for pretty human readable output.
    PRETTY = 'pretty'
Different style types for json.
def dataclass_from_dict(
    cls: type[T],
    values: dict,
    *,
    codec: Codec = Codec.JSON,
    coerce_to_float: bool = True,
    allow_unknown_attrs: bool = True,
    discard_unknown_attrs: bool = False,
) -> T:
    """Given a dict, return a dataclass of a given type.

    The dict must be formatted to match the specified codec (generally
    json-friendly object types). This means that sequence values such as
    tuples or sets should be passed as lists, enums should be passed as
    their associated values, nested dataclasses should be passed as dicts,
    etc.

    All values are checked to ensure their types/values are valid.

    Data for attributes of type Any will be checked to ensure they match
    types supported directly by json. This does not include types such
    as tuples which are implicitly translated by Python's json module
    (as this would break the ability to do a lossless round-trip with
    data).

    If coerce_to_float is True, int values passed for float typed fields
    will be converted to float values. Otherwise, a TypeError is raised.

    If allow_unknown_attrs is False, AttributeErrors will be raised for
    attributes present in the dict but not on the data class. Otherwise,
    they will be preserved as part of the instance and included if it is
    exported back to a dict, unless discard_unknown_attrs is True, in
    which case they will simply be discarded.
    """
    val = _Inputter(
        cls,
        codec=codec,
        coerce_to_float=coerce_to_float,
        allow_unknown_attrs=allow_unknown_attrs,
        discard_unknown_attrs=discard_unknown_attrs,
    ).run(values)
    assert isinstance(val, cls)
    return val
Given a dict, return a dataclass of a given type.
The dict must be formatted to match the specified codec (generally json-friendly object types). This means that sequence values such as tuples or sets should be passed as lists, enums should be passed as their associated values, nested dataclasses should be passed as dicts, etc.
All values are checked to ensure their types/values are valid.
Data for attributes of type Any will be checked to ensure they match types supported directly by json. This does not include types such as tuples which are implicitly translated by Python's json module (as this would break the ability to do a lossless round-trip with data).
If coerce_to_float is True, int values passed for float typed fields will be converted to float values. Otherwise, a TypeError is raised.
If allow_unknown_attrs is False, AttributeErrors will be raised for attributes present in the dict but not on the data class. Otherwise, they will be preserved as part of the instance and included if it is exported back to a dict, unless discard_unknown_attrs is True, in which case they will simply be discarded.
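A small sketch pulling these rules together; the Sprite and Color types are invented for illustration:

from dataclasses import dataclass
from enum import Enum

from efro.dataclassio import ioprepped, dataclass_from_dict


class Color(Enum):
    RED = 'red'
    GREEN = 'green'


@ioprepped
@dataclass
class Sprite:
    color: Color = Color.RED
    pos: tuple[float, float] = (0.0, 0.0)


# Enums pass as their values, tuples pass as lists, and the ints here
# coerce to float since coerce_to_float defaults to True.
sprite = dataclass_from_dict(Sprite, {'color': 'green', 'pos': [1, 2]})
assert sprite.color is Color.GREEN
assert sprite.pos == (1.0, 2.0)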
def dataclass_from_json(
    cls: type[T],
    json_str: str,
    coerce_to_float: bool = True,
    allow_unknown_attrs: bool = True,
    discard_unknown_attrs: bool = False,
) -> T:
    """Return a dataclass instance given a json string.

    Basically dataclass_from_dict(json.loads(...))
    """

    return dataclass_from_dict(
        cls=cls,
        values=json.loads(json_str),
        coerce_to_float=coerce_to_float,
        allow_unknown_attrs=allow_unknown_attrs,
        discard_unknown_attrs=discard_unknown_attrs,
    )
Return a dataclass instance given a json string.
Basically dataclass_from_dict(json.loads(...))
def dataclass_to_dict(
    obj: Any,
    codec: Codec = Codec.JSON,
    coerce_to_float: bool = True,
    discard_extra_attrs: bool = False,
) -> dict:
    """Given a dataclass object, return a json-friendly dict.

    All values will be checked to ensure they match the types specified
    on fields. Note that a limited set of types and data configurations
    is supported.

    Values with type Any will be checked to ensure they match types
    supported directly by json. This does not include types such as
    tuples which are implicitly translated by Python's json module (as
    this would break the ability to do a lossless round-trip with data).

    If coerce_to_float is True, integer values present on float typed
    fields will be converted to float in the dict output. If False, a
    TypeError will be triggered.
    """

    out = _Outputter(
        obj,
        create=True,
        codec=codec,
        coerce_to_float=coerce_to_float,
        discard_extra_attrs=discard_extra_attrs,
    ).run()
    assert isinstance(out, dict)
    return out
Given a dataclass object, return a json-friendly dict.
All values will be checked to ensure they match the types specified on fields. Note that a limited set of types and data configurations is supported.
Values with type Any will be checked to ensure they match types supported directly by json. This does not include types such as tuples which are implicitly translated by Python's json module (as this would break the ability to do a lossless round-trip with data).
If coerce_to_float is True, integer values present on float typed fields will be converted to float in the dict output. If False, a TypeError will be triggered.
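A minimal sketch of the float coercion; the Point type is invented for illustration:

from dataclasses import dataclass

from efro.dataclassio import ioprepped, dataclass_to_dict


@ioprepped
@dataclass
class Point:
    x: float = 0.0
    y: float = 0.0


# The int sneaking into x is coerced to float on output; with
# coerce_to_float=False this would raise a TypeError instead.
assert dataclass_to_dict(Point(x=1, y=2.5)) == {'x': 1.0, 'y': 2.5}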
def dataclass_to_json(
    obj: Any,
    coerce_to_float: bool = True,
    pretty: bool = False,
    sort_keys: bool | None = None,
) -> str:
    """Utility function; return a json string from a dataclass instance.

    Basically json.dumps(dataclass_to_dict(...)).

    By default, keys are sorted for pretty output and not otherwise, but
    this can be overridden by supplying a value for the 'sort_keys' arg.
    """

    jdict = dataclass_to_dict(
        obj=obj, coerce_to_float=coerce_to_float, codec=Codec.JSON
    )
    if sort_keys is None:
        sort_keys = pretty
    if pretty:
        return json.dumps(jdict, indent=2, sort_keys=sort_keys)
    return json.dumps(jdict, separators=(',', ':'), sort_keys=sort_keys)
Utility function; return a json string from a dataclass instance.
Basically json.dumps(dataclass_to_dict(...)). By default, keys are sorted for pretty output and not otherwise, but this can be overridden by supplying a value for the 'sort_keys' arg.
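Continuing with the hypothetical Point type from the previous sketch:

from efro.dataclassio import dataclass_to_json

pt = Point(y=2.5)
print(dataclass_to_json(pt))               # {"x":0.0,"y":2.5}
print(dataclass_to_json(pt, pretty=True))  # multi-line, sorted keys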
def dataclass_validate(
    obj: Any,
    coerce_to_float: bool = True,
    codec: Codec = Codec.JSON,
    discard_extra_attrs: bool = False,
) -> None:
    """Ensure that values in a dataclass instance are the correct types."""

    # Simply run an output pass but tell it not to generate data;
    # only run validation.
    _Outputter(
        obj,
        create=False,
        codec=codec,
        coerce_to_float=coerce_to_float,
        discard_extra_attrs=discard_extra_attrs,
    ).run()
Ensure that values in a dataclass instance are the correct types.
def dataclass_hash(obj: Any, coerce_to_float: bool = True) -> str:
    """Calculate a hash for the provided dataclass.

    Basically this emits json for the dataclass (with keys sorted
    to keep things deterministic) and hashes the resulting string.
    """
    import hashlib
    from base64 import urlsafe_b64encode

    json_dict = dataclass_to_dict(
        obj, codec=Codec.JSON, coerce_to_float=coerce_to_float
    )

    # Need to sort keys to keep things deterministic.
    json_str = json.dumps(json_dict, separators=(',', ':'), sort_keys=True)

    sha = hashlib.sha256()
    sha.update(json_str.encode())

    # Go with urlsafe base64 instead of the usual hex to save some
    # space, and kill those ugly padding chars at the end.
    return urlsafe_b64encode(sha.digest()).decode().strip('=')
Calculate a hash for the provided dataclass.
Basically this emits json for the dataclass (with keys sorted to keep things deterministic) and hashes the resulting string.
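Equal values thus always hash to the same string; again using the hypothetical Point type from above:

from efro.dataclassio import dataclass_hash

assert dataclass_hash(Point(x=1.0)) == dataclass_hash(Point(x=1.0))

# 32 sha256 bytes encode to 44 url-safe base64 chars, 43 once the
# single padding '=' is stripped.
assert len(dataclass_hash(Point())) == 43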
def ioprep(cls: type, globalns: dict | None = None) -> None:
    """Prep a dataclass type for use with this module's functionality.

    Prepping ensures that all types contained in a data class as well as
    the usage of said types are supported by this module and pre-builds
    necessary constructs needed for encoding/decoding/etc.

    Prepping will happen on-the-fly as needed, but a warning will be
    emitted in such cases, as it is better to explicitly prep all used
    types early in a process to ensure any invalid types or
    configurations are caught immediately.

    Prepping a dataclass involves evaluating its type annotations, which,
    as of PEP 563, are stored simply as strings. This evaluation is done
    with localns set to the class dict (so that types defined in the class
    can be used) and globalns set to the containing module's dict.
    It is possible to override globalns for special cases such as when
    prepping happens as part of an execed string instead of within a
    module.
    """
    PrepSession(explicit=True, globalns=globalns).prep_dataclass(
        cls, recursion_level=0
    )
Prep a dataclass type for use with this module's functionality.
Prepping ensures that all types contained in a data class as well as the usage of said types are supported by this module and pre-builds necessary constructs needed for encoding/decoding/etc.
Prepping will happen on-the-fly as needed, but a warning will be emitted in such cases, as it is better to explicitly prep all used types early in a process to ensure any invalid types or configurations are caught immediately.
Prepping a dataclass involves evaluating its type annotations, which, as of PEP 563, are stored simply as strings. This evaluation is done with localns set to the class dict (so that types defined in the class can be used) and globalns set to the containing module's dict. It is possible to override globalns for special cases such as when prepping happens as part of an execed string instead of within a module.
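As a sketch of the globalns override, consider a class defined by exec'ing a string; its stringified annotations can only be resolved against the exec namespace, so we pass that explicitly (this example is invented):

from efro.dataclassio import ioprep

src = (
    'from __future__ import annotations\n'
    'from dataclasses import dataclass\n'
    '\n'
    '@dataclass\n'
    'class Thing:\n'
    '    val: int = 0\n'
)
ns: dict = {}
exec(src, ns)

# Thing's annotations are plain strings with no real module behind
# them, so hand ioprep the exec namespace to evaluate them against.
ioprep(ns['Thing'], globalns=ns)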
def ioprepped(cls: type[T]) -> type[T]:
    """Class decorator for easily prepping a dataclass at definition time.

    Note that in some cases it may not be possible to prep a dataclass
    immediately (such as when its type annotations refer to
    forward-declared types). In these cases, ioprep() should be
    explicitly called for the class as soon as possible; ideally at
    module import time to expose any errors as early as possible in
    execution.
    """
    ioprep(cls)
    return cls
Class decorator for easily prepping a dataclass at definition time.
Note that in some cases it may not be possible to prep a dataclass immediately (such as when its type annotations refer to forward-declared types). In these cases, ioprep() should be explicitly called for the class as soon as possible; ideally at module import time to expose any errors as early as possible in execution.
def is_ioprepped_dataclass(obj: Any) -> bool:
    """Return whether the obj is an ioprepped dataclass type or instance."""
    cls = obj if isinstance(obj, type) else type(obj)
    return dataclasses.is_dataclass(cls) and hasattr(cls, PREP_ATTR)
Return whether the obj is an ioprepped dataclass type or instance.
def will_ioprep(cls: type[T]) -> type[T]:
    """Class decorator hinting that we will prep a class later.

    In some cases (such as recursive types) we cannot use the @ioprepped
    decorator and must instead call ioprep() explicitly later. However,
    some of our custom pylint checking behaves differently when the
    @ioprepped decorator is present, in that case requiring type
    annotations to be present and not simply forward-declared under an
    "if TYPE_CHECKING" block (since they are used at runtime).

    The @will_ioprep decorator triggers the same pylint behavior
    differences as @ioprepped (which are necessary for the later
    ioprep() call to work correctly) but without actually running any
    prep itself.
    """
    return cls
Class decorator hinting that we will prep a class later.
In some cases (such as recursive types) we cannot use the @ioprepped decorator and must instead call ioprep() explicitly later. However, some of our custom pylint checking behaves differently when the @ioprepped decorator is present, in that case requiring type annotations to be present and not simply forward-declared under an "if TYPE_CHECKING" block (since they are used at runtime).
The @will_ioprep decorator triggers the same pylint behavior differences as @ioprepped (which are necessary for the later ioprep() call to work correctly) but without actually running any prep itself.
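A minimal sketch of the recursive case the docstring mentions; the Node type is invented for illustration:

from __future__ import annotations

from dataclasses import dataclass, field

from efro.dataclassio import ioprep, will_ioprep


# Prepping here would need to evaluate 'list[Node]' before the module
# level name 'Node' exists, so @ioprepped would fail; hint that the
# prep comes later instead.
@will_ioprep
@dataclass
class Node:
    children: list[Node] = field(default_factory=list)


# Now the name exists, so explicit prep succeeds.
ioprep(Node)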