NPU-Centric Consensus

Mining

TensorChain is a verifiable delay function based on high-dimensional matrix multiplication, designed to favor consumer NPUs over industrial GPU farms.

TensorChain Proof of Useful Work

The TensorChain puzzle saturates unified memory bandwidth rather than raw compute throughput (TFLOPS), flipping the economics in favor of consumer hardware. The puzzle is tuned to target roughly 75% of available system RAM on high-end consumer devices.

  • Seed derivation from previous block hash + miner nonce via SHAKE256
  • Deterministic matrix generation: A, B sized to ~100 GB baseline (above H100 VRAM)
  • Compute noisy product C' = (A+E)·(B+F) using Neural Engine INT8 tensor units
  • Digest via Merkle root of diagonal elements, hashed into succinct proof
  • Verification via Freivalds' algorithm in O(n²) instead of O(n³) (see the sketch after this list)
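
A minimal sketch of one such round, in Python with NumPy. Helper names are illustrative, the dimension is kept tiny, the noise terms E and F are omitted, and a plain hash of the diagonal stands in for the Merkle digest; the production puzzle sizes A and B at roughly 100 GB each.

    # Illustrative TensorChain mining round (toy dimensions, hypothetical helpers).
    import hashlib
    import numpy as np

    def derive_seed(prev_block_hash: bytes, nonce: int) -> bytes:
        # Seed = SHAKE256(previous block hash || miner nonce).
        return hashlib.shake_256(prev_block_hash + nonce.to_bytes(8, "big")).digest(32)

    def deterministic_matrix(seed: bytes, tag: bytes, n: int) -> np.ndarray:
        # Deterministically expand the seed into an INT8 matrix.
        rng_seed = int.from_bytes(hashlib.shake_256(seed + tag).digest(8), "big")
        return np.random.default_rng(rng_seed).integers(-128, 128, size=(n, n), dtype=np.int8)

    def mine_round(prev_block_hash: bytes, nonce: int, n: int = 256):
        seed = derive_seed(prev_block_hash, nonce)
        A = deterministic_matrix(seed, b"A", n).astype(np.int64)
        B = deterministic_matrix(seed, b"B", n).astype(np.int64)
        C = A @ B                                                  # the memory-bound product
        digest = hashlib.sha256(np.diag(C).tobytes()).digest()    # stand-in for the Merkle digest
        return A, B, C, digest

    def freivalds_verify(A, B, C, rounds: int = 20) -> bool:
        # Probabilistic O(n^2) check that C == A @ B, avoiding the O(n^3) recomputation.
        rng = np.random.default_rng()
        for _ in range(rounds):
            r = rng.integers(0, 2, size=(A.shape[0], 1)).astype(np.int64)
            if not np.array_equal(A @ (B @ r), C @ r):
                return False
        return True

    A, B, C, proof = mine_round(b"\x00" * 32, nonce=42)
    assert freivalds_verify(A, B, C)
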
Proof of Memory Capacity

By sizing the matrices so that the working set exceeds H100 VRAM (80 GB) but fits within Mac Studio unified memory (192 GB), TensorChain creates a "Proof of Memory Capacity and Bandwidth" that physically excludes PCIe-bound GPU rigs.
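
A back-of-the-envelope check of that bound, assuming one byte per INT8 element; the exact byte budget is an assumption, not a protocol constant.

    # Rough sizing for the "Proof of Memory Capacity" target (illustrative numbers).
    import math

    BYTES_PER_ELEMENT = 1            # INT8 entries
    TARGET_BYTES = 100 * 10**9       # ~100 GB baseline per operand matrix

    # Side length of a square N x N matrix occupying TARGET_BYTES:
    n = math.isqrt(TARGET_BYTES // BYTES_PER_ELEMENT)
    print(f"N ≈ {n:,}")                               # ≈ 316,227
    print(f"one operand ≈ {n * n / 1e9:.1f} GB")      # above 80 GB VRAM, below 192 GB UMA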

The Batch-1 Efficiency Gap

Industrial GPUs collapse in efficiency when forced to process single inference requests. Consumer NPUs are optimized for exactly this workload.

Metric                        Nvidia H100 (Industrial)      Apple M2 Ultra (Consumer)
Optimal Batch Size            ≥ 64                          1
Joules per Token (Batch 1)    ~15 J                         ~11 J
Memory Access                 CPU → PCIe → VRAM copies      Unified Memory (0 copy)
Outcome                       Expensive latency overhead    Native advantage

By mandating sequential, low-batch inference operations, Po8 forces industrial miners to operate in their most inefficient regime while consumer devices operate in their optimal regime. This economic inversion is the key to decentralization.

Hardware Configurations

Validator Tier

Mac Studio (M2/M3 Ultra) with 128 GB+ RAM. Full node + miner + mixnet relay. Maximum TensorChain participation.

Miner Tier

MacBook Pro M-Series Max with 64 GB RAM. Sequential TensorChain workloads. Efficient batch-1 inference.

Edge Tier

Kneron KL720 USB accelerator. Participates via sharded mining pools. Memory-light CNN workloads.

Mobile Tier

Mobile NPUs via sharded task decomposition. Contributes to aggregate network security through pooled resources.

Tensor sizes automatically adapt to fill available unified memory without swapping. The scheduler routes memory-heavy tasks to UMA nodes and compute-heavy tasks to accelerator nodes.
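
A hypothetical sketch of that sizing rule, reusing the 75% RAM target mentioned above; the assumption that the result accumulates in INT32 is illustrative.

    # Pick the largest square dimension whose A, B (INT8) and C (INT32) fit in
    # ~75% of free unified memory, so the puzzle never spills to swap.
    import math

    def pick_matrix_dim(free_bytes: int, fill_ratio: float = 0.75) -> int:
        budget = int(free_bytes * fill_ratio)
        bytes_per_cell = 1 + 1 + 4        # A, B at 1 byte/elem; C at 4 bytes/elem
        return math.isqrt(budget // bytes_per_cell)

    print(pick_matrix_dim(192 * 10**9))   # e.g. a 192 GB Mac Studio node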

InferNet Layer

Beyond entropy generation, InferNet utilizes NPUs for useful AI inference tasks with economic value.

Optimistic Verification

Miners run models and post results with staked bonds. Fishermen re-execute off-chain during challenge windows.
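
One possible shape for such a bonded result, with illustrative field names and an assumed challenge-window length; the real protocol's parameters are not specified here.

    # Hypothetical bonded inference claim awaiting its challenge window.
    from dataclasses import dataclass

    @dataclass
    class InferenceClaim:
        task_id: str
        output_hash: str              # hash of the posted model output
        miner: str
        bond: int                     # stake slashed if a fisherman's challenge succeeds
        posted_at_block: int
        challenge_window: int = 600   # blocks during which re-execution can dispute it

        def is_final(self, current_block: int) -> bool:
            return current_block >= self.posted_at_block + self.challenge_window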

Bisection Protocol

Disputes are bisected down to a single instruction. The divergent operation is executed on-chain to determine the truth.
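
A sketch of the bisection search itself; the state-lookup callables are stand-ins for the two parties' committed execution traces, and the interface is an assumption.

    # Binary-search the execution trace for the first step whose output is disputed.
    def find_divergent_step(asserter_state, challenger_state, num_steps: int) -> int:
        lo, hi = 0, num_steps              # invariant: parties agree at lo, disagree at hi
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if asserter_state(mid) == challenger_state(mid):
                lo = mid
            else:
                hi = mid
        return hi                          # only this step needs on-chain re-execution

    # Toy traces that agree through step 6, then diverge at step 7.
    honest  = lambda i: ("state", min(i, 6))
    cheater = lambda i: ("state", min(i, 6)) if i <= 6 else ("forged", i)
    assert find_divergent_step(honest, cheater, num_steps=10) == 7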

INT8 Determinism

Strict INT8 quantization ensures bit-for-bit identical outputs across all hardware—Kneron dongles match M2 Ultras.
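
A small illustration of why integer arithmetic makes this possible: an exact INT32 accumulate followed by a fixed-point rescale has a single well-defined answer, unlike floating-point accumulation whose rounding can vary with execution order. The scale and rounding rule below are assumptions, not protocol constants.

    import numpy as np

    def int8_matmul_requant(a_q, b_q, scale_num: int, scale_den: int) -> np.ndarray:
        acc = a_q.astype(np.int32) @ b_q.astype(np.int32)            # exact integer accumulate
        scaled = (acc * scale_num + scale_den // 2) // scale_den     # deterministic rounding
        return np.clip(scaled, -128, 127).astype(np.int8)

    rng = np.random.default_rng(0)
    a = rng.integers(-128, 128, (4, 8), dtype=np.int8)
    b = rng.integers(-128, 128, (8, 4), dtype=np.int8)
    print(int8_matmul_requant(a, b, scale_num=1, scale_den=64))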

Model Registry

An on-chain registry tracks supported models with quantization parameters, ONNX graph hashes, and licensing metadata.
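
A hypothetical registry entry carrying those fields; the names and types are illustrative only.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ModelRegistryEntry:
        model_id: str             # human-readable identifier
        onnx_graph_hash: str      # hash of the canonical ONNX graph
        quant_scheme: str         # e.g. "int8-symmetric"
        activation_scale: float   # quantization parameters pinned for determinism
        weight_scale: float
        license: str              # licensing metadata

    entry = ModelRegistryEntry("resnet50-v1", "<onnx-graph-hash>", "int8-symmetric", 0.021, 0.008, "Apache-2.0")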

Pool Architecture

Not everyone owns a high-end workstation. Sharded mining enables participation from modular accelerators and mobile devices.

Workload Decomposition

  • Large matrices decomposed into sub-blocks (see the sketch after this list)
  • Kneron nodes assigned specific sub-blocks to compute
  • Results aggregated by pool coordinators
  • Rewards distributed proportionally to contribution
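
A small sketch of that decomposition, assuming square matrices and a tile size that divides them evenly; the pool's real tiling and aggregation scheme is not specified here.

    import numpy as np

    def split_blocks(M: np.ndarray, tile: int) -> dict:
        # Index each tile by its (row, column) position in the block grid.
        n = M.shape[0]
        return {(i // tile, j // tile): M[i:i + tile, j:j + tile]
                for i in range(0, n, tile) for j in range(0, n, tile)}

    def pooled_matmul(A: np.ndarray, B: np.ndarray, tile: int) -> np.ndarray:
        # Each tile-pair product is small enough for an edge NPU; the pool
        # coordinator only performs the cheap aggregation of partial results.
        n = A.shape[0]
        a_blk, b_blk = split_blocks(A, tile), split_blocks(B, tile)
        C = np.zeros((n, n), dtype=np.int64)
        for i in range(n // tile):
            for j in range(n // tile):
                for k in range(n // tile):
                    C[i*tile:(i+1)*tile, j*tile:(j+1)*tile] += a_blk[(i, k)] @ b_blk[(k, j)]
        return C

    A = np.arange(16, dtype=np.int64).reshape(4, 4)
    B = np.ones((4, 4), dtype=np.int64)
    assert np.array_equal(pooled_matmul(A, B, tile=2), A @ B)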

Reconfigurable Data Paths

  • Kneron architecture switches operation types at runtime
  • Conv2D to Dilated Convolution without reloading
  • High utilization even on fragmented workloads
  • Native protocol support for heterogeneous pools