Why Bag?

Most software systems deal with structured things: web pages, configuration files, APIs, database schemas. We usually treat these as separate worlds, each with its own language and tools.

Yet they all share something very simple: they are organized hierarchically, and humans reason about them by location rather than by mechanism.

The typical toolbox problem

A typical Python project dealing with hierarchical data ends up using:

  • omegaconf or hydra for hierarchical configuration

  • pydantic for validation and nested models

  • munch/addict or dotmap for dot notation access

  • rxpy or reactivex for reactivity

  • lxml or xmltodict for XML handling

  • jsonpointer or manual dict walking for path resolution

  • Custom decorators for lazy loading from DB or API

  • Plenty of glue code (wrappers, adapters, utility functions) to make these tools talk to each other

The result:

  • 10+ dependencies in requirements.txt

  • 5-6 different styles to access the same data (., [], /, get() with defaults, etc.)

  • Time spent chasing incompatible updates

  • Fragmented code, hard to reason about and explain to newcomers

  • Low real productivity, despite the impression of “using the most modern stack”

One model, one way of thinking

Bag proposes a different approach: one structure, one mental model, one access point.

With Bag you get, in the same object:

Need

Typical solution(s)

With Bag

Hierarchical data

dict + manual nesting

Native path-based access

Dot notation

munch/addict

Built-in

Configuration

omegaconf, hydra

Bag + paths

Structural validation

Custom code

Bag + genro-builders

Lazy/computed values

@property, custom decorators

Transparent resolvers (sync/async)

Reactivity

rxpy, signals, custom events

Location-based subscriptions

XML/JSON handling

lxml, xmltodict, json

Unified serialization

Glue code / adapters

Many custom utils

Almost none

What about pydantic?

A common question: “Why not just use pydantic?”

Pydantic is excellent for schema validation: you define a model with type hints, and pydantic validates and coerces incoming data. It’s the right tool when you know the exact shape of your data upfront and want strict type checking.

Bag solves a different problem:

Aspect

Pydantic

Bag

Primary purpose

Schema validation

Hierarchical data manipulation

Schema

Required upfront

Optional (via genro-builders)

Structure

Fixed at definition

Dynamic, can grow

Access pattern

Attribute access on models

Path-based access anywhere

Reactivity

None built-in

Subscriptions on any node

Lazy loading

Not built-in

Resolvers (sync/async)

Serialization

JSON primarily

XML, JSON, MessagePack

Type preservation

Yes, via schema

Yes, via TYTX format

When to use pydantic: API request/response validation, configuration with known schema, data transfer objects.

When to use Bag: Dynamic hierarchical structures, document manipulation, reactive state, lazy-loaded trees, multi-format serialization.

They can coexist: use pydantic at system boundaries (API input validation), Bag for internal hierarchical state management.

The real benefit

The developer writes less glue code, always reasons about the same mental model, and spends more time solving the domain problem instead of being a systems integrator.

This is a huge advantage especially in contexts where:

  • Data structure is complex and long-lived (configurations, DB schemas, UI trees, business documents)

  • There are multiple sources/sinks (frontend, backend, files, APIs, databases)

  • The team is heterogeneous or has turnover

  • The project must live for years

In practice, it’s the “less is more” approach applied to data structures: one well-designed library covering 80-90% of common use cases, instead of 10 specialized libraries covering 100% but with a very high operational cost.

Summary

One coherent model. Less glue. More domain logic. Higher velocity.

>>> from genro_bag import Bag

>>> # One structure for everything
>>> config = Bag()
>>> config['database.host'] = 'localhost'
>>> config['database.port'] = 5432
>>> config['database.host']
'localhost'

>>> # With metadata
>>> config.set_item('database.pool_size', 10, min=1, max=100)
>>> config['database.pool_size?max']
100

>>> # Reactive
>>> changes = []
>>> config.subscribe('watcher', any=lambda **kw: changes.append(kw['evt']))
>>> config['database.timeout'] = 30
>>> 'ins' in changes
True

See the Benchmarks for performance characteristics and comparisons with other approaches.