Event Store & Databases

The Event Store is the database that stores all events and model state. Bitcoin Crawler supports multiple database backends with automatic schema management - you control the database, framework handles everything else.

What is Event Store?

Event Store persists:

Events: Immutable records of all state changes
Model State: Current state derived from events
Metadata: Block heights, timestamps, versions

Your models generate events → Framework stores in Event Store → State reconstructed from events.

Storage Flow

┌─────────────────────────────────────────────────────────┐
│              Event Store Architecture                    │
└─────────────────────────────────────────────────────────┘

Model generates event
       │
       ▼
┌──────────────────┐
│  Event Store     │  Framework manages
│   Component      │
└──────────────────┘
       │
       ▼
┌──────────────────┐
│   Database       │  You configure and control
│  (SQLite/       │
│   Postgres/     │
│   IndexedDB)    │
└──────────────────┘
       │
       ├────────────────┬──────────────────┐
       ▼                ▼                  ▼
┌────────────┐   ┌────────────┐    ┌────────────┐
│   Events   │   │   State    │    │  Metadata  │
│   Table    │   │  Snapshots │    │   Tables   │
└────────────┘   └────────────┘    └────────────┘

Supported Databases

SQLite

File-based database, perfect for development and smaller deployments.

Advantages:

Single file, easy to manage
Zero configuration
Fast for moderate data
Easy backups (copy file)
No separate server needed
Works on desktop applications

Best for:

Development and testing
Small to medium projects (< 1M events)
Desktop applications
Single-server deployments
Quick prototypes

Limitations:

Single writer (no concurrent writes from multiple processes)
Limited for very large datasets
No built-in replication

Configuration: Just specify file path - framework creates and manages database.

PostgreSQL

Production-grade relational database for larger deployments.

Advantages:

Handles millions of events
Excellent performance for large datasets
Concurrent access from multiple applications
Advanced indexing
Replication support
Connection pooling
Industry-standard

Best for:

Production deployments
Large datasets (> 1M events)
High query volume
Multiple indexers sharing database server
When you need replication
Enterprise applications

Requirements:

PostgreSQL server running
Database created
User with appropriate permissions

Configuration: Provide connection settings (host, port, username, password, database name).

IndexedDB

Browser-based storage for client-side applications.

Advantages:

Works in browser without server
Client-side privacy (data never leaves browser)
No server costs
Offline-capable

Limitations:

Storage quota limits (browser-dependent)
Performance slower than server databases
Can't handle massive datasets
Limited query capabilities compared to SQL

Best for:

Browser extensions
Client-side blockchain explorers
Privacy-focused applications
Offline-first applications
When server deployment impossible

Configuration: Specify database name for browser environment.

Database Management

Automatic Schema Creation

Framework manages database schema automatically:

First run:

Creates all necessary tables
Sets up indexes
Configures constraints
Optimizes for Event Sourcing patterns

Schema updates:

Handles schema changes automatically (when enabled)
Adds new tables/columns as needed
Maintains backward compatibility

User control:

Enable/disable automatic schema management
Recommended: enable in development, manual in production

Your Data, Your Control

You own the database:

Deploy on your infrastructure
Configure access controls
Manage backups
Set retention policies
Share between applications (PostgreSQL)

Framework responsibility:

Schema structure
Event storage operations
Query optimization
Index management

Storage Growth

Event Store size grows over time based on:

Factors:

Number of models
Events per block
Event payload size
Historical depth (how far back you index)
Mempool monitoring (if enabled)

Optimization:

Store only necessary data in event payloads
Don't generate events for irrelevant data
Use appropriate database (PostgreSQL for large)
Consider retention policies for very old events

Planning:

Simple model (balance tracking):
  ~100 KB per 1000 blocks

Complex model (full transaction parsing):
  ~10 MB per 1000 blocks

Mempool monitoring:
  +50-100 MB per day (depends on activity)

Calculate based on your model and plan storage accordingly.

Performance Considerations

SQLite Optimization

Benefits:

Simple file-based
Fast for moderate datasets
Low overhead

Considerations:

Single writer limitation
Performance degrades with very large databases
Consider PostgreSQL if exceeding 10GB

PostgreSQL Optimization

Benefits:

Excellent large dataset performance
Concurrent access support
Advanced query optimization
Replication support

Tuning:

Configure connection pool size
Adjust cache settings
Monitor query performance
Add indexes for your query patterns

Scaling:

Use read replicas for query-heavy workloads
Partition large tables (if needed)
Separate database server from crawler

Backup and Recovery

Backup Strategy

SQLite:

Stop application or ensure no writes
Copy database file
Automated with cron/scheduled tasks
Store backups off-site

PostgreSQL:

Use pg_dump for logical backups
Configure WAL archiving for point-in-time recovery
Set up automated backup schedule
Test restore procedures regularly

Best practices:

Regular schedule (daily minimum, hourly for critical)
Off-site storage
Test restores periodically
Keep multiple backup versions
Encrypt backups if sensitive data

Disaster Recovery

Corrupted database:

Restore from latest backup
Replay from last known good state
Verify data integrity after restore

Data loss:

Restore from backup
May need to re-sync from affected block height
Event Sourcing helps recovery

Multi-Application Setup

Multiple crawler instances can share PostgreSQL server:

Approach:

Each crawler uses separate database
Or separate schemas in same database
Share PostgreSQL server resources
Manage connections with pooling

Benefits:

Resource efficiency
Centralized management
Easier backups
Cost savings

Considerations:

Monitor total connections
Tune PostgreSQL for multiple clients
Consider disk I/O capacity

Security

Access Control

SQLite:

File system permissions
Restrict access to database file
Encrypt file system if needed

PostgreSQL:

Dedicated database user per crawler
Grant minimum required permissions
Use SSL for connections
Network firewall rules

General:

Never expose database directly to internet
Use strong passwords
Regular security updates
Monitor access logs

Data Protection

Encryption:

At rest: Encrypt file system or use database encryption
In transit: Use SSL for PostgreSQL connections
Backups: Encrypt backup files

Compliance:

Your data stays on your infrastructure
You control retention policies
Full audit trail via Event Sourcing
Meet regulatory requirements

Best Practices

Database Selection

Decision tree:

Development/Testing
    └──> SQLite

Desktop Application
    └──> SQLite

Small Project (< 1M events)
    └──> SQLite or PostgreSQL

Production (> 1M events)
    └──> PostgreSQL

Browser Extension
    └──> IndexedDB

High Concurrency
    └──> PostgreSQL

Multiple Indexers
    └──> PostgreSQL (shared server)

Capacity Planning

Estimate event volume: Events per block × blocks per day
Calculate storage: Event size × total events
Add buffer: 30-50% for indexes and metadata
Plan growth: Factor in historical sync + ongoing
Monitor actual usage: Adjust as needed

Maintenance

SQLite:

Run VACUUM periodically to reclaim space
Monitor file size growth
Check integrity with PRAGMA integrity_check

PostgreSQL:

Configure autovacuum appropriately
Monitor table bloat
REINDEX if needed
Update statistics regularly

Monitoring

Track metrics:

Database size
Growth rate
Query performance
Connection pool usage
Disk space available

Set alerts:

Disk space warnings
Slow query detection
Connection pool exhaustion
Database errors

Reorganization Impact

When blockchain reorganizes, Event Store automatically handles it:

What happens:

Events from orphaned blocks marked as rolled back
State recalculated without those events
New blocks processed
New events stored
State updated

Database operations:

No events deleted (immutable log maintained)
Rollback markers added
New events appended
State snapshots updated

Your responsibility: None - automatic

Architecture Context

Event Store integrates with all system components:

Models generate events → Event Store persists Event Sourcing pattern → Event Store implements Transport queries data → Event Store provides Database stores → Event Store manages

Event Store is the persistence layer - it ensures all state changes are safely stored and retrievable. events) └──> SQLite or PostgreSQL

What is Event Store?​

Storage Flow​

Supported Databases​

SQLite​

PostgreSQL​

IndexedDB​

Database Management​

Automatic Schema Creation​

Your Data, Your Control​

Storage Growth​

Performance Considerations​

SQLite Optimization​

PostgreSQL Optimization​

Backup and Recovery​

Backup Strategy​

Disaster Recovery​

Multi-Application Setup​

Sharing PostgreSQL Server​

Security​

Access Control​

Data Protection​

Best Practices​

Database Selection​

Capacity Planning​

Maintenance​

Monitoring​

Reorganization Impact​

Architecture Context​

What is Event Store?

Storage Flow

Supported Databases

SQLite

PostgreSQL

IndexedDB

Database Management

Automatic Schema Creation

Your Data, Your Control

Storage Growth

Performance Considerations

SQLite Optimization

PostgreSQL Optimization

Backup and Recovery

Backup Strategy

Disaster Recovery

Multi-Application Setup

Sharing PostgreSQL Server

Security

Access Control

Data Protection

Best Practices

Database Selection

Capacity Planning

Maintenance

Monitoring

Reorganization Impact

Architecture Context