Event Store & Databases
The Event Store is the database that holds all events and model state. Bitcoin Crawler supports multiple database backends with automatic schema management: you control the database, and the framework handles everything else.
What is the Event Store?
The Event Store persists:
- Events: Immutable records of all state changes
- Model State: Current state derived from events
- Metadata: Block heights, timestamps, versions
Your models generate events → the framework stores them in the Event Store → state is reconstructed from those events.
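As a rough illustration of that flow, the sketch below shows a hypothetical stored-event shape and how state is rebuilt from events; the field names are assumptions based on the metadata listed above, not the framework's actual types.

```typescript
// Hypothetical shapes -- field names are illustrative, not the framework's schema.
interface StoredEvent {
  type: string;        // e.g. "DepositReceived"
  payload: unknown;    // model-specific data
  blockHeight: number; // metadata: where in the chain the event occurred
  timestamp: number;   // metadata: when it occurred
  version: number;     // metadata: ordering within the model's event stream
}

// Current model state is derived by replaying events in order.
function reconstructState<S>(
  events: StoredEvent[],
  apply: (state: S, event: StoredEvent) => S,
  initial: S
): S {
  return [...events]
    .sort((a, b) => a.version - b.version)
    .reduce(apply, initial);
}
```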
Storage Flow
```
┌──────────────────────────────────────────┐
│         Event Store Architecture         │
└──────────────────────────────────────────┘

           Model generates event
                     │
                     ▼
            ┌──────────────────┐
            │   Event Store    │  Framework manages
            │    Component     │
            └──────────────────┘
                     │
                     ▼
            ┌──────────────────┐
            │     Database     │  You configure and control
            │    (SQLite/      │
            │     Postgres/    │
            │     IndexedDB)   │
            └──────────────────┘
                     │
      ┌──────────────┼───────────────┐
      ▼              ▼               ▼
┌────────────┐ ┌────────────┐ ┌────────────┐
│   Events   │ │   State    │ │  Metadata  │
│   Table    │ │ Snapshots  │ │   Tables   │
└────────────┘ └────────────┘ └────────────┘
```
Supported Databases
SQLite
File-based database, perfect for development and smaller deployments.
Advantages:
- Single file, easy to manage
- Zero configuration
- Fast for moderate data
- Easy backups (copy file)
- No separate server needed
- Works on desktop applications
Best for:
- Development and testing
- Small to medium projects (< 1M events)
- Desktop applications
- Single-server deployments
- Quick prototypes
Limitations:
- Single writer (no concurrent writes from multiple processes)
- Limited for very large datasets
- No built-in replication
Configuration: Just specify a file path; the framework creates and manages the database.
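A minimal configuration sketch, assuming a config object along these lines (the exact option names depend on your framework version):

```typescript
// Hypothetical SQLite configuration -- option names are illustrative.
const config = {
  eventStore: {
    type: 'sqlite',
    // The framework creates this file and its schema on first run.
    path: './data/bitcoin-crawler.db',
  },
};
```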
PostgreSQL
Production-grade relational database for larger deployments.
Advantages:
- Handles millions of events
- Excellent performance for large datasets
- Concurrent access from multiple applications
- Advanced indexing
- Replication support
- Connection pooling
- Industry-standard
Best for:
- Production deployments
- Large datasets (> 1M events)
- High query volume
- Multiple indexers sharing database server
- When you need replication
- Enterprise applications
Requirements:
- PostgreSQL server running
- Database created
- User with appropriate permissions
Configuration: Provide connection settings (host, port, username, password, database name).
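A hedged sketch of what those connection settings might look like (option names are illustrative, not the framework's exact API):

```typescript
// Hypothetical PostgreSQL configuration -- option names are illustrative.
const config = {
  eventStore: {
    type: 'postgres',
    host: 'localhost',
    port: 5432,
    user: 'crawler',                   // dedicated user with minimum permissions
    password: process.env.DB_PASSWORD, // keep secrets out of source control
    database: 'bitcoin_crawler',
  },
};
```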
IndexedDB
Browser-based storage for client-side applications.
Advantages:
- Works in browser without server
- Client-side privacy (data never leaves browser)
- No server costs
- Offline-capable
Limitations:
- Storage quota limits (browser-dependent)
- Performance slower than server databases
- Can't handle massive datasets
- Limited query capabilities compared to SQL
Best for:
- Browser extensions
- Client-side blockchain explorers
- Privacy-focused applications
- Offline-first applications
- When server deployment is impossible
Configuration: Specify a database name for the browser environment.
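A minimal sketch for a browser build (again, the option names are assumptions):

```typescript
// Hypothetical IndexedDB configuration for a browser environment.
const config = {
  eventStore: {
    type: 'indexeddb',
    // Name of the database inside the browser's origin-scoped storage.
    database: 'bitcoin-crawler',
  },
};
```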
Database Management
Automatic Schema Creation
The framework manages the database schema automatically:
First run:
- Creates all necessary tables
- Sets up indexes
- Configures constraints
- Optimizes for Event Sourcing patterns
Schema updates:
- Handles schema changes automatically (when enabled)
- Adds new tables/columns as needed
- Maintains backward compatibility
User control:
- Enable/disable automatic schema management
- Recommended: enable it in development, manage schema changes manually in production (see the sketch below)
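One way to express that recommendation, assuming a hypothetical autoSchema flag (the real option name may differ):

```typescript
// Hypothetical toggle -- let the framework manage schema in development,
// apply schema changes manually (e.g. via migrations) in production.
const config = {
  eventStore: {
    autoSchema: process.env.NODE_ENV !== 'production',
  },
};
```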
Your Data, Your Control
You own the database:
- Deploy on your infrastructure
- Configure access controls
- Manage backups
- Set retention policies
- Share between applications (PostgreSQL)
Framework responsibility:
- Schema structure
- Event storage operations
- Query optimization
- Index management
Storage Growth
Event Store size grows over time.
Factors:
- Number of models
- Events per block
- Event payload size
- Historical depth (how far back you index)
- Mempool monitoring (if enabled)
Optimization:
- Store only necessary data in event payloads
- Don't generate events for irrelevant data
- Use appropriate database (PostgreSQL for large)
- Consider retention policies for very old events
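To illustrate the first point above, here is a sketch of trimming an event payload down to what the model actually needs (the transaction fields are only examples):

```typescript
// A raw transaction carries far more data than most models need to store.
interface RawTransaction {
  txid: string;
  hex: string;        // full serialized transaction -- large
  inputs: unknown[];
  outputs: unknown[];
}

// Keep only the fields required to rebuild state later.
interface DepositEventPayload {
  txid: string;
  address: string;
  amountSats: number;
}

function toDepositPayload(tx: RawTransaction, address: string, amountSats: number): DepositEventPayload {
  return { txid: tx.txid, address, amountSats };
}
```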
Planning:
- Simple model (balance tracking): ~100 KB per 1,000 blocks
- Complex model (full transaction parsing): ~10 MB per 1,000 blocks
- Mempool monitoring: +50-100 MB per day (depends on activity)
Calculate based on your model and plan storage accordingly.
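A back-of-the-envelope calculation using the figures above (the rates are rough and will vary with your models):

```typescript
// Rough storage planner based on the per-1,000-block rates quoted above.
const KB = 1024;
const MB = 1024 * KB;

const blocksPerDay = 144;          // Bitcoin averages roughly 144 blocks per day
const perThousandBlocks = 10 * MB; // complex model: ~10 MB per 1,000 blocks
const mempoolPerDay = 75 * MB;     // mempool monitoring: ~50-100 MB per day

function estimateBytes(days: number, withMempool: boolean): number {
  const blockData = (blocksPerDay * days / 1000) * perThousandBlocks;
  const mempoolData = withMempool ? days * mempoolPerDay : 0;
  return blockData + mempoolData;
}

// One year of a complex model with mempool monitoring:
console.log((estimateBytes(365, true) / (1024 * MB)).toFixed(1), 'GB');
// ≈ 0.5 GB of block data + ~26.7 GB of mempool data ≈ 27 GB total
```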
Performance Considerations
SQLite Optimization
Benefits:
- Simple file-based
- Fast for moderate datasets
- Low overhead
Considerations:
- Single writer limitation
- Performance degrades with very large databases
- Consider PostgreSQL once the database exceeds roughly 10 GB
PostgreSQL Optimization
Benefits:
- Excellent large dataset performance
- Concurrent access support
- Advanced query optimization
- Replication support
Tuning:
- Configure connection pool size
- Adjust cache settings
- Monitor query performance
- Add indexes for your query patterns
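If you manage the pool yourself (the framework may already handle this internally), pool sizing with node-postgres looks roughly like this:

```typescript
import { Pool } from 'pg';

// Keep max well below PostgreSQL's max_connections, especially when
// several applications share the same server.
const pool = new Pool({
  host: 'localhost',
  port: 5432,
  user: 'crawler',
  password: process.env.DB_PASSWORD,
  database: 'bitcoin_crawler',
  max: 10,                        // upper bound on concurrent connections
  idleTimeoutMillis: 30_000,      // release idle connections after 30 s
  connectionTimeoutMillis: 5_000, // fail fast if the server is unreachable
});
```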
Scaling:
- Use read replicas for query-heavy workloads
- Partition large tables (if needed)
- Separate database server from crawler
Backup and Recovery
Backup Strategy
SQLite:
- Stop application or ensure no writes
- Copy database file
- Automate with cron/scheduled tasks
- Store backups off-site
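A minimal sketch of the file-copy approach (paths are examples; make sure the crawler is stopped or idle first):

```typescript
import { copyFileSync, mkdirSync } from 'node:fs';
import { join } from 'node:path';

// Copy the SQLite file to a timestamped backup while no writes are happening.
const source = './data/bitcoin-crawler.db'; // example path
const backupDir = './backups';
mkdirSync(backupDir, { recursive: true });

const stamp = new Date().toISOString().replace(/[:.]/g, '-');
copyFileSync(source, join(backupDir, `bitcoin-crawler-${stamp}.db`));
```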
PostgreSQL:
- Use pg_dump for logical backups
- Configure WAL archiving for point-in-time recovery
- Set up automated backup schedule
- Test restore procedures regularly
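A sketch of automating pg_dump from Node (connection details are examples; schedule the script with cron or a task scheduler):

```typescript
import { execFile } from 'node:child_process';

// Logical backup in pg_dump's custom format, restorable with pg_restore.
const stamp = new Date().toISOString().slice(0, 10);
const outFile = `./backups/bitcoin_crawler-${stamp}.dump`;

execFile(
  'pg_dump',
  ['-h', 'localhost', '-p', '5432', '-U', 'crawler', '-Fc', '-f', outFile, 'bitcoin_crawler'],
  { env: { ...process.env, PGPASSWORD: process.env.DB_PASSWORD } },
  (error) => {
    if (error) {
      console.error('Backup failed:', error.message);
      process.exitCode = 1;
    } else {
      console.log('Backup written to', outFile);
    }
  }
);
```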
Best practices:
- Regular schedule (daily minimum, hourly for critical)
- Off-site storage
- Test restores periodically
- Keep multiple backup versions
- Encrypt backups if sensitive data
Disaster Recovery
Corrupted database:
- Restore from latest backup
- Replay from last known good state
- Verify data integrity after restore
Data loss:
- Restore from backup
- May need to re-sync from affected block height
- Event Sourcing helps recovery: state can be rebuilt by replaying stored events
Multi-Application Setup
Sharing PostgreSQL Server
Multiple crawler instances can share a PostgreSQL server:
Approach:
- Each crawler uses separate database
- Or separate schemas in same database
- Share PostgreSQL server resources
- Manage connections with pooling
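For example, two crawler instances sharing one server through separate databases might be configured like this (names and option shapes are illustrative):

```typescript
// Hypothetical configs for two crawlers sharing the same PostgreSQL server.
const sharedServer = { type: 'postgres', host: 'db.internal', port: 5432 };

const mainnetCrawler = {
  eventStore: { ...sharedServer, database: 'crawler_mainnet', user: 'crawler_mainnet' },
};

const testnetCrawler = {
  eventStore: { ...sharedServer, database: 'crawler_testnet', user: 'crawler_testnet' },
};
```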
Benefits:
- Resource efficiency
- Centralized management
- Easier backups
- Cost savings
Considerations:
- Monitor total connections
- Tune PostgreSQL for multiple clients
- Consider disk I/O capacity
Security
Access Control
SQLite:
- File system permissions
- Restrict access to database file
- Encrypt file system if needed
PostgreSQL:
- Dedicated database user per crawler
- Grant minimum required permissions
- Use SSL for connections
- Network firewall rules
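When connecting over a network, enable TLS; with node-postgres the ssl option looks like this (the certificate path is an example):

```typescript
import { readFileSync } from 'node:fs';
import { Pool } from 'pg';

// Verify the server certificate against your CA rather than disabling checks.
const pool = new Pool({
  host: 'db.internal',
  user: 'crawler',
  password: process.env.DB_PASSWORD,
  database: 'bitcoin_crawler',
  ssl: {
    ca: readFileSync('/etc/ssl/certs/db-ca.pem', 'utf8'), // example CA path
    rejectUnauthorized: true,
  },
});
```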
General:
- Never expose database directly to internet
- Use strong passwords
- Regular security updates
- Monitor access logs
Data Protection
Encryption:
- At rest: Encrypt file system or use database encryption
- In transit: Use SSL for PostgreSQL connections
- Backups: Encrypt backup files
Compliance:
- Your data stays on your infrastructure
- You control retention policies
- Full audit trail via Event Sourcing
- Meet regulatory requirements