Why Bag?
Most software systems deal with structured things: web pages, configuration files, APIs, database schemas. We usually treat these as separate worlds, each with its own language and tools.
Yet they all share something very simple: they are organized hierarchically, and humans reason about them by location rather than by mechanism.
The typical toolbox problem
A typical Python project dealing with hierarchical data ends up using:
omegaconf or hydra for hierarchical configuration
pydantic for validation and nested models
munch/addict or dotmap for dot notation access
rxpy or reactivex for reactivity
lxml or xmltodict for XML handling
jsonpointer or manual dict walking for path resolution
Custom decorators for lazy loading from DB or API
Plenty of glue code (wrappers, adapters, utility functions) to make these tools talk to each other
The result:
10+ dependencies in requirements.txt
5-6 different styles to access the same data (., [], /, get() with defaults, etc.)
Time spent chasing incompatible updates
Fragmented code, hard to reason about and explain to newcomers
Low real productivity, despite the impression of “using the most modern stack”
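To make the "different access styles" point concrete, here is a minimal sketch that reads the same nested configuration in four ways using only the standard library. The helpers `to_namespace` and `resolve` are hypothetical stand-ins for what dot-notation and path-resolution libraries provide; they are not taken from any of the packages listed above.

```python
from types import SimpleNamespace

# The same nested configuration, held as a plain dict.
raw = {"database": {"host": "localhost", "port": 5432}}

# Style 1: chained subscripts (raises KeyError when a key is missing).
host_by_index = raw["database"]["host"]

# Style 2: chained get() with defaults (never raises, but can hide typos).
timeout_by_get = raw.get("database", {}).get("timeout", 30)


# Style 3: dot notation, here via a hand-rolled conversion.
def to_namespace(value):
    """Recursively convert dicts to SimpleNamespace (hypothetical helper)."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in value.items()})
    return value


port_by_dot = to_namespace(raw).database.port


# Style 4: a slash-separated path, resolved by manual dict walking.
def resolve(data, path, sep="/"):
    """Follow a separator-delimited path through nested dicts (hypothetical helper)."""
    node = data
    for key in path.split(sep):
        node = node[key]
    return node


host_by_path = resolve(raw, "database/host")

print(host_by_index, timeout_by_get, port_by_dot, host_by_path)
# prints: localhost 30 5432 localhost
```

Each style has its own failure mode (KeyError, a silent default, AttributeError), which is part of what makes mixed codebases hard to explain to newcomers.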
One model, one way of thinking
Bag proposes a different approach: one structure, one mental model, one access point.
With Bag you get, in the same object:
| Need | Typical solution(s) | With Bag |
|---|---|---|
| Hierarchical data | dict + manual nesting | Native path-based access |
| Dot notation | munch/addict | Built-in |
| Configuration | omegaconf, hydra | Bag + paths |
| Structural validation | Custom code | Bag + genro-builders |
| Lazy/computed values | @property, custom decorators | Transparent resolvers (sync/async) |
| Reactivity | rxpy, signals, custom events | Location-based subscriptions |
| XML/JSON handling | lxml, xmltodict, json | Unified serialization |
| Glue code / adapters | Many custom utils | Almost none |
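The idea of combining path-based access, lazy values and location-based subscriptions in one object can be illustrated with a toy class. This is a sketch under stated assumptions, not Bag's implementation or API: `PathTree`, its method signatures and the `"upd"` event label are invented here purely for illustration.

```python
class PathTree:
    """Toy tree with dotted-path access, lazy values and change callbacks.

    Hypothetical illustration only; this is not how Bag is implemented.
    """

    def __init__(self):
        self._root = {}
        self._subscribers = []  # list of (path_prefix, callback) pairs

    def __setitem__(self, path, value):
        *parents, leaf = path.split(".")
        node = self._root
        for key in parents:
            node = node.setdefault(key, {})  # create intermediate levels on demand
        event = "upd" if leaf in node else "ins"  # illustrative event labels
        node[leaf] = value
        # Notify subscribers registered on this path or on any ancestor.
        for prefix, callback in self._subscribers:
            if path == prefix or path.startswith(prefix + "."):
                callback(event, path)

    def __getitem__(self, path):
        node = self._root
        for key in path.split("."):
            node = node[key]
        # A callable stored at a leaf acts as a lazy value, computed on read.
        return node() if callable(node) else node

    def subscribe(self, prefix, callback):
        """Register a callback for changes at or below the given location."""
        self._subscribers.append((prefix, callback))


tree = PathTree()
events = []
tree.subscribe("database", lambda event, path: events.append((event, path)))

tree["database.host"] = "localhost"
tree["database.url"] = lambda: f"db://{tree['database.host']}"  # lazy value
tree["database.host"] = "db.internal"
tree["cache.ttl"] = 60  # outside the subscribed location, so no event

print(tree["database.url"])  # prints: db://db.internal
print(events)
# prints: [('ins', 'database.host'), ('ins', 'database.url'), ('upd', 'database.host')]
```

The point of the sketch is the single addressing scheme: reads, writes, lazy computation and notifications are all keyed by the same dotted location, so no adapter is needed between them.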
What about pydantic?
A common question: “Why not just use pydantic?”
Pydantic is excellent for schema validation: you define a model with type hints, and pydantic validates and coerces incoming data. It’s the right tool when you know the exact shape of your data upfront and want strict type checking.
Bag solves a different problem:
| Aspect | Pydantic | Bag |
|---|---|---|
| Primary purpose | Schema validation | Hierarchical data manipulation |
| Schema | Required upfront | Optional (via genro-builders) |
| Structure | Fixed at definition | Dynamic, can grow |
| Access pattern | Attribute access on models | Path-based access anywhere |
| Reactivity | None built-in | Subscriptions on any node |
| Lazy loading | Not built-in | Resolvers (sync/async) |
| Serialization | JSON primarily | XML, JSON, MessagePack |
| Type preservation | Yes, via schema | Yes, via TYTX format |
When to use pydantic: API request/response validation, configuration with known schema, data transfer objects.
When to use Bag: Dynamic hierarchical structures, document manipulation, reactive state, lazy-loaded trees, multi-format serialization.
They can coexist: use pydantic at system boundaries (API input validation), Bag for internal hierarchical state management.
The real benefit
The developer writes less glue code, reasons within a single mental model, and spends more time on the domain problem than on integrating systems.
The advantage is greatest in contexts where:
Data structure is complex and long-lived (configurations, DB schemas, UI trees, business documents)
There are multiple sources/sinks (frontend, backend, files, APIs, databases)
The team is heterogeneous or has turnover
The project must live for years
In practice, it’s the “less is more” approach applied to data structures: one well-designed library that covers 80-90% of common use cases, instead of ten specialized libraries that together cover 100% at a very high operational cost.
Summary
One coherent model. Less glue. More domain logic. Higher velocity.
>>> from genro_bag import Bag
>>> # One structure for everything
>>> config = Bag()
>>> config['database.host'] = 'localhost'
>>> config['database.port'] = 5432
>>> config['database.host']
'localhost'
>>> # With metadata
>>> config.set_item('database.pool_size', 10, min=1, max=100)
>>> config['database.pool_size?max']
100
>>> # Reactive
>>> changes = []
>>> config.subscribe('watcher', any=lambda **kw: changes.append(kw['evt']))
>>> config['database.timeout'] = 30
>>> 'ins' in changes
True
See the Benchmarks for performance characteristics and comparisons with other approaches.