Archive JSONL logs at 5–13% of original size. Query any time window instantly with DuckDB — directly from S3, no restore, no full decompression. No egress fees, no heavy infrastructure, no lock-in.
From log collection to millisecond queries — a complete, self-hosted stack. No managed services, no vendor dependencies.
Purpose-built patterns for the 10 most common log types. If your team produces it, PFC-JSONL understands it.
Five problems. One pipeline to solve them all.
Every alternative has a hidden cost. Here's the honest breakdown.
| Tool | Cost | Setup | Random Access | Your Infra |
|---|---|---|---|---|
| S3 Select | $0.002/GB scanned + egress | AWS-only | ❌ No block index | ❌ AWS lock-in |
| Athena | $5 per TB scanned | Glue catalog + partitioning | ❌ Full file scan | ❌ AWS lock-in |
| Parquet + Athena | $5 per TB scanned | Schema upfront, complex pipeline | ⚠️ Row groups only | ❌ AWS lock-in |
| Elasticsearch / ELK | Cluster cost + 2–3× storage | 3–5 servers, 16GB+ RAM | ✅ But expensive | ✅ Self-hosted |
| Loki (Grafana) | Cluster or Grafana Cloud fees | Kubernetes sidecar + object store | ❌ No block-level seek | ⚠️ Complex ops |
| PFC-JSONL Pipeline | Free for personal / OSS | 1 command | ✅ Block-level index | ✅ Runs anywhere |
Tell us about your current setup — see exactly what you pay today and what you'd save with PFC.
Estimates are based on AWS S3 Standard ($0.023/GB/mo), Athena ($5/TB scanned), and PFC output at 5–13% of original size (API access logs: ~12.8%, infra/system logs: ~5.3%). Same-region access is assumed; internet egress is billed at $0.09/GB.
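The storage side of that estimate is simple arithmetic. The sketch below reproduces it with the S3 Standard price and the two size fractions quoted above. The function name and the 1,000 GB example volume are illustrative assumptions. It also assumes the baseline is stored uncompressed and ignores request fees, egress, and query costs.

```python
# Price quoted in the text above; everything else here is an illustrative assumption.
S3_STANDARD_USD_PER_GB_MONTH = 0.023


def monthly_storage_cost(raw_gb: float, size_fraction: float = 1.0) -> float:
    """Monthly S3 Standard cost for raw_gb of logs stored at size_fraction of original size."""
    return raw_gb * size_fraction * S3_STANDARD_USD_PER_GB_MONTH


if __name__ == "__main__":
    raw_gb = 1000  # hypothetical archive size
    for label, fraction in [
        ("uncompressed", 1.0),
        ("API access logs (~12.8%)", 0.128),
        ("infra/system logs (~5.3%)", 0.053),
    ]:
        print(f"{label}: ${monthly_storage_cost(raw_gb, fraction):.2f}/month")
    # uncompressed: $23.00/month
    # API access logs (~12.8%): $2.94/month
    # infra/system logs (~5.3%): $1.22/month
```

If your current archives are already gzip-compressed, the realistic baseline is smaller than the uncompressed figure, so pass your own fraction for the comparison.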
Every piece of your pipeline — from log ingestion to database archiving to SQL queries — covered by purpose-built tools that all speak PFC.
Not another gzip wrapper. A purpose-built compression pipeline for structured log data.
Burrows-Wheeler Transform reorders data for maximum symbol locality. Sparse rANS O2 entropy coding achieves near-theoretical compression limits. Block structure enables parallel compression and random access.
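To make the first stage concrete, here is a minimal textbook sketch of the Burrows-Wheeler Transform and its inverse. It is not PFC-JSONL's implementation: it uses the naive sorted-rotations method, which is far too slow for real blocks (production encoders typically use suffix arrays), and it omits the rANS entropy coding stage entirely. The function names and the sentinel convention are assumptions made for illustration.

```python
def bwt(text: str, sentinel: str = "\x00") -> str:
    """Naive Burrows-Wheeler Transform: the last column of the sorted rotations."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "\x00") -> str:
    """Invert the transform by repeatedly prepending the last column and re-sorting."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    # The row that ends with the sentinel is the original text plus sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    print(bwt("banana", sentinel="$"))          # annb$aa
    print(inverse_bwt("annb$aa", sentinel="$"))  # banana
```

The transform is a reversible permutation: it does not shrink the data itself, but it tends to place identical symbols next to each other, which is what lets a following entropy coder work well on repetitive input such as JSONL keys.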
| Compressor | Ratio | Random Access |
|---|---|---|
| PFC-JSONL | | ✅ Block-indexed |
| gzip -9 | | ❌ Stream only |
| zstd -3 | | ❌ Stream only |
| xz -6 | | ❌ Very slow |
Ratio varies by log type. Tested on 1 GB real-world datasets across 10 enterprise log formats. Full benchmark report.
Three paths into the pipeline. Pick the one that fits your stack.
    -- Install once
    INSTALL pfc FROM community;
    LOAD pfc;

    -- Query with timestamp filtering
    SELECT level, message, service
    FROM read_pfc_jsonl('logs/2026-01-01.pfc')
    WHERE ts >= 1735686000
      AND ts < 1735689600
      AND level = 'ERROR';

    -- Works on local files or mounted paths
    -- Only decompresses matching blocks
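The idea behind "only decompresses matching blocks" can be sketched in a few lines. This is a conceptual model, not the PFC file format: `zlib` stands in for the PFC codec, and `Block`, `build_blocks`, and `query` are hypothetical names. Each block carries the minimum and maximum timestamp of its records, so a time-window query can skip blocks without touching their payload.

```python
import json
import zlib
from dataclasses import dataclass


@dataclass
class Block:
    min_ts: int
    max_ts: int
    payload: bytes  # independently compressed JSONL lines


def build_blocks(records, block_size=2):
    """Compress records into independent blocks and record each block's ts range."""
    blocks = []
    for start in range(0, len(records), block_size):
        chunk = records[start:start + block_size]
        raw = "\n".join(json.dumps(r) for r in chunk).encode("utf-8")
        timestamps = [r["ts"] for r in chunk]
        # zlib is a stand-in for the real codec (assumption).
        blocks.append(Block(min(timestamps), max(timestamps), zlib.compress(raw)))
    return blocks


def query(blocks, ts_from, ts_to):
    """Return (records with ts_from <= ts < ts_to, number of blocks decompressed)."""
    hits, opened = [], 0
    for block in blocks:
        if block.max_ts < ts_from or block.min_ts >= ts_to:
            continue  # ruled out by the index alone, payload never read
        opened += 1
        for line in zlib.decompress(block.payload).decode("utf-8").splitlines():
            record = json.loads(line)
            if ts_from <= record["ts"] < ts_to:
                hits.append(record)
    return hits, opened
```

With six records at `ts` 0, 10, 20, 30, 40, 50 and two records per block, a query for `20 <= ts < 40` opens one block out of three.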
    # Step 1: Install pfc_jsonl binary
    curl -L https://github.com/ImpossibleForge/pfc-jsonl/releases/latest/download/pfc_jsonl-linux-x64 \
      -o /usr/local/bin/pfc_jsonl && chmod +x /usr/local/bin/pfc_jsonl

    # Step 2: Download and start the forwarder
    curl -L https://raw.githubusercontent.com/ImpossibleForge/pfc-fluentbit/main/pfc_forwarder.py \
      -o /opt/pfc_forwarder.py
    python3 /opt/pfc_forwarder.py

    # Step 3: Point Fluent Bit at it (fluent-bit.conf)
    # [OUTPUT]
    #     Name   tcp
    #     Match  *
    #     Host   127.0.0.1
    #     Port   5170
    #     Format json_lines
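With `Format json_lines`, Fluent Bit's TCP output sends one JSON object per line, and a receiver has to cope with TCP chunks that end mid-record. The sketch below shows that framing step only. It is not taken from `pfc_forwarder.py`, whose internals are not described here, and `JsonLinesBuffer` is a hypothetical name.

```python
import json


class JsonLinesBuffer:
    """Reassemble newline-delimited JSON records from arbitrary byte chunks."""

    def __init__(self):
        self._pending = b""  # bytes of a record that has not been terminated yet

    def feed(self, chunk: bytes) -> list:
        """Add a chunk and return every record completed by it."""
        data = self._pending + chunk
        # Everything before the last newline is complete; the remainder waits.
        *complete, self._pending = data.split(b"\n")
        return [json.loads(line) for line in complete if line.strip()]
```

A receiver would call `feed()` with each chunk read from the socket and pass the returned records on to the compressor.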
    # Convert existing S3 archives (no egress; runs in-region)
    pip install "pfc-migrate[s3]"
    pfc-migrate s3 --bucket my-logs --prefix 2025/ --pattern "*.gz"

    # Azure Blob Storage
    pip install "pfc-migrate[azure]"
    pfc-migrate azure --container logs --pattern "*.gz"

    # Lossless verified: MD5 checked before original is touched
    # Supports: .gz / .bz2 / .zst / .lz4 → .pfc
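The "verify before touching the original" step can be illustrated with a local round trip. This is a sketch of the general pattern, not `pfc-migrate`'s code: `lzma` stands in for the PFC encoder, `migrate_verified` is a hypothetical name, and deleting the source after a successful check is an assumption about what "touched" means.

```python
import gzip
import hashlib
import lzma
from pathlib import Path


def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def migrate_verified(src: Path, dst: Path) -> bool:
    """Re-encode a .gz file and remove the original only if the round trip matches."""
    original = gzip.decompress(src.read_bytes())
    dst.write_bytes(lzma.compress(original))  # stand-in for the PFC encoder (assumption)

    # Read the new file back and compare digests before touching the source.
    restored = lzma.decompress(dst.read_bytes())
    if md5_hex(restored) != md5_hex(original):
        dst.unlink()  # discard the bad output, keep the original
        return False

    src.unlink()  # assumption: the original is removed after verification
    return True
```

The important property is the ordering: the source file is left alone until the re-encoded copy has been read back and its digest matches.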
Free for personal use and open-source projects.
No account. No signup. No usage limits. No phone-home.
Your data stays in your infrastructure.
Commercial use? Contact us.
Straight answers, no fluff.