How 3-Pass Rebalancing Works
Stock rebalancing sounds simple. Move inventory from where it is not selling to where it is. In practice, doing this well across a multi-branch retail network is a surprisingly deep problem.
Replenify's rebalancing engine uses a 3-pass architecture, where each pass solves a distinct class of problem and builds on the results of the one before it. This post explains how the passes work, why the order matters, and what makes the whole system deterministic.
Why three passes?
A single-pass approach -- "look at all stock, figure out where it should go" -- sounds efficient but produces poor results. The reason is that different types of stock movement have different economics, different constraints, and different priorities.
Warehouse stock sitting in a distribution centre is already costing you money in carrying costs and is generating zero revenue. Getting it to stores is the highest priority. Inter-store transfers have freight costs that must be weighed against potential uplift. Fragmented stock scattered across stores in unsellable quantities needs consolidation before it can be useful.
Trying to optimise all three simultaneously creates a combinatorial explosion. Solving them sequentially, in the right order, produces better results in a fraction of the time.
Pass 1: Warehouse to retail
The first pass handles the simplest and highest-value problem: pushing warehouse stock to stores that can sell it.
How it works
The engine evaluates every SKU in the warehouse against every store's demand profile. For each potential push, it considers:
- Demand signal. Does this store have recent sales velocity for this SKU or similar SKUs? Is the store currently stocked out or running low?
- Size run viability. Pushing a single size 9 to a store that has no other sizes in that style creates a fragment, not a sellable offering. The engine checks whether a push completes or extends a viable size run.
- Store capacity. Stores have physical and financial limits. The engine respects both.
- Minimum push quantities. Sending one unit to a store is rarely worth the logistics cost. Configurable minimums ensure pushes are operationally sensible.
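The four checks above can be sketched as a single gate. This is an illustrative toy, not Replenify's implementation: the function name, the adjacent-size rule for run viability, and the default minimum of 3 units are all assumptions made for the example.

```python
def should_push(velocity, on_hand, sizes_held, push_size,
                capacity_left, push_qty, min_push_qty=3):
    """Illustrative Pass 1 gate: a push goes ahead only if all
    four checks pass. Thresholds here are invented for the sketch."""
    # 1. Demand signal: the store is selling this SKU, or is stocked out
    has_demand = velocity > 0 or on_hand == 0
    # 2. Size run viability: the pushed size must sit next to a size
    #    the store already holds (an empty store can start a run)
    extends_run = not sizes_held or any(abs(push_size - s) <= 1 for s in sizes_held)
    # 3. Store capacity: physical/financial headroom for the quantity
    fits = push_qty <= capacity_left
    # 4. Minimum push quantity: the move must be operationally sensible
    sensible = push_qty >= min_push_qty
    return has_demand and extends_run and fits and sensible
```

A size 9 pushed to a store holding sizes 7 and 8 passes the run check; the same size 9 pushed to a store holding only size 4 creates a fragment and is blocked.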
In the pilot, Pass 1 generated 129 warehouse-to-retail moves. These are the lowest-risk, highest-return moves because the stock is already in a central location with established distribution logistics.
What makes it different
Most warehouse allocation systems push stock uniformly -- every store gets its "fair share" based on a simple formula. Replenify's Pass 1 pushes stock to where demand actually exists, informed by real sell-through data and size curve intelligence at the store level.
Pass 2: Inter-store balancing
This is where it gets interesting. Pass 2 handles the most complex and highest-volume problem: moving stock between stores.
The distance-aware routing challenge
Moving stock between stores is not free. A transfer from a Johannesburg store to a Cape Town store costs R200-500 per box in freight. A transfer between two stores in the same shopping centre might cost R50. The engine must weigh the potential gross profit uplift of every move against its actual transport cost.
This is why naive rebalancing fails. A spreadsheet can tell you that Store A has too much and Store B has too little. It cannot tell you whether the freight cost of fixing that imbalance exceeds the margin you will recover.
How inter-store balancing works
For every SKU at every store, the engine evaluates potential transfers to every other store. Each candidate move runs through the 17-stage Need Assessment Engine (NAE) pipeline:
- Demand verification. Does the receiving store have genuine demand for this SKU?
- Stock surplus confirmation. Does the sending store have genuine surplus -- stock beyond what it can sell in the remaining season?
- Size run impact at source. Will removing this stock break a sellable size run at the sending store? Breaking a size run at the source to fix one at the destination is usually a net negative.
- Size run impact at destination. Does this move complete or extend a sellable size run at the receiving store?
- Distance cost calculation. What is the actual freight cost for this specific origin-destination pair?
- GP uplift estimation. What is the expected gross profit impact, accounting for the probability of sale at full price versus markdown at each location?
- Minimum quantity aggregation. Can this move be bundled with other moves on the same route to meet minimum shipment thresholds?
- Cannibalisation check. Will this move reduce sales at the sending store, or is the stock genuinely surplus?
- Stages 9-17. Additional constraint checks covering store capacity, seasonal timing, product lifecycle stage, promotional conflicts, return-to-vendor eligibility, category balance, fixture capacity, regional clustering rules, and consolidation opportunity flagging.
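Structurally, a staged pipeline like this is an ordered chain of checks where the first failure short-circuits with a reason code. Here is a minimal sketch of that pattern; the stage functions and reason codes are invented for the example, not Replenify's actual 17 stages or 37 codes:

```python
def evaluate(move, stages):
    """Run a candidate move through ordered stages. Each stage returns
    a blocking reason code, or None to pass. The first block wins;
    a move that clears every stage returns None (recommend it)."""
    for stage in stages:
        code = stage(move)
        if code is not None:
            return code
    return None

# Two toy stages, in pipeline order (hypothetical codes)
def demand_check(m):
    return "NO_DEMAND_AT_DEST" if m["dest_velocity"] == 0 else None

def surplus_check(m):
    return "NO_SURPLUS_AT_SOURCE" if m["source_surplus"] <= 0 else None

stages = [demand_check, surplus_check]
```

The key property is that every blocked SKU exits with a specific code rather than silently disappearing, which is what makes the "always explainable" guarantee possible.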
Every SKU that does not result in a move receives a specific blocking reason code (37 possible codes). This is critical for transparency: you can always explain why a move was not recommended.
In the pilot, Pass 2 generated 1,443 inter-store moves -- the bulk of the engine's output.
Regional clustering
South African retail networks tend to cluster geographically: a group of stores in Gauteng, another in the Western Cape, a few in KZN. The engine exploits this structure by preferring intra-cluster transfers (lower freight cost, faster transit) over inter-cluster transfers (higher cost, slower).
This does not mean inter-cluster transfers never happen. When a style is dead in every Gauteng store but selling well in Cape Town, the engine will recommend the cross-country move -- but only when the GP uplift clearly exceeds the transport cost.
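One way to express "only when the GP uplift clearly exceeds the transport cost" is to make inter-cluster moves clear a higher bar than intra-cluster ones. The margin factor of 1.5 below is an illustrative assumption, not a published Replenify parameter:

```python
def recommend_transfer(gp_uplift, freight_cost, same_cluster, margin=1.5):
    """Illustrative gate: intra-cluster moves need uplift above freight;
    inter-cluster moves must beat freight by a safety margin, so only
    clearly profitable cross-country moves get through."""
    required = freight_cost if same_cluster else margin * freight_cost
    return gp_uplift > required
```

Under this rule, an uplift of R250 against R200 freight is approved within Gauteng but rejected Gauteng-to-Cape-Town, where the bar is R300.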
Pass 3: Sweep and consolidation
The third pass solves a problem that the first two passes can create: fragmentation.
The fragment problem
After Passes 1 and 2, some stores may end up with small quantities of stock that are not sellable on their own. Three units of a style in sizes 5, 8, and 11 is not a viable offering. No customer walks in wanting exactly those sizes.
Pass 3 identifies these fragments and consolidates them. The CONSOLIDATE_REGIONAL algorithm sweeps fragmented stock from multiple stores into a single location where the combined units form a sellable offering.
How consolidation works
The engine uses a bin-packing approach:
- Identify fragments. Scan all stores for SKU groups where the remaining stock does not form a viable size run.
- Find consolidation targets. Identify stores in each regional cluster that are the best candidates for receiving consolidated stock (based on demand profile, existing inventory, and logistics hub proximity).
- Evaluate consolidation moves. Each potential consolidation is evaluated for: combined size run viability, total freight cost of inbound transfers, expected sell-through at the consolidated location, and comparison against the alternative (marking down fragments in place).
- Execute or explain. Profitable consolidations are recommended. Unprofitable ones get a blocking reason explaining why the stock is better left in place or marked down.
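The steps above can be sketched with a simple viability rule: treat a size run as sellable if it contains at least three consecutive sizes, and consolidate only when doing so beats marking the fragments down in place. Both the rule and the function names are assumptions for this example; the actual CONSOLIDATE_REGIONAL logic is richer.

```python
def longest_run(sizes):
    """Length of the longest consecutive-size run, e.g. [7, 8, 9, 11] -> 3."""
    s = sorted(set(sizes))
    best = run = 1
    for a, b in zip(s, s[1:]):
        run = run + 1 if b - a == 1 else 1
        best = max(best, run)
    return best

def worth_consolidating(fragments, inbound_freight, uplift_if_sellable,
                        markdown_loss_in_place, min_run=3):
    """Consolidate fragments from several stores into one location only if
    (a) the combined stock forms a viable run, and (b) the net result
    beats the alternative of marking the fragments down where they sit."""
    combined = [size for frag in fragments for size in frag]
    if longest_run(combined) < min_run:
        return False
    return uplift_if_sellable - inbound_freight > -markdown_loss_in_place
```

Fragments in sizes 7, 8, and 9 across three stores combine into a sellable run; the sizes 5, 8, and 11 from the earlier example never will, so that stock is better marked down in place.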
Why this pass matters
Without consolidation, rebalancing creates a long tail of unsellable fragments. These fragments eventually get marked down, often at 50%+ discounts. Consolidation turns fragments into full-price selling opportunities.
Determinism: same input, same output
Every run of the Replenify engine is deterministic. Given identical input data, the engine produces identical output, every time. This is a deliberate architectural choice.
Why determinism matters
- Auditability. Every recommendation can be explained and verified. Retail buyers and merchandise planners need to trust the system. "The AI said so" is not an acceptable explanation in a business where a single bad decision can cost hundreds of thousands of rands.
- Reproducibility. If a result looks wrong, you can re-run it with the same inputs and get the same output. This makes debugging and validation possible.
- Determinism hash. Every engine run produces a hash that uniquely identifies the computation. You can verify that a result has not been tampered with or corrupted.
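A determinism hash of this kind is typically built by canonicalising the run's inputs and outputs and hashing the result. The sketch below shows the general technique; Replenify's actual hashing scheme is not public, so the field names and use of SHA-256 here are assumptions:

```python
import hashlib
import json

def run_hash(inputs, moves):
    """Hash a canonical JSON encoding of the run. Sorting keys and
    stripping whitespace make the encoding byte-identical for identical
    data, so identical runs always produce the same digest."""
    canonical = json.dumps({"inputs": inputs, "moves": moves},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Re-running with the same snapshot reproduces the same hash; any change to a single quantity or destination produces a different one, which is what makes tampering and corruption detectable.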
This is what we mean by "deterministic where it matters, AI-enhanced where it helps." The core engine that decides which stock to move is fully deterministic. AI enhances the intelligence layer -- demand forecasting, anomaly detection, natural language interaction through Reppi -- but the execution engine is transparent and verifiable.
Performance
In the pilot, the engine processed 28,348 inventory snapshot rows and generated 1,572 intelligent moves across all three passes in 265 milliseconds. This speed matters because it means the engine can run frequently -- daily or even intra-day -- adapting to demand changes in near real-time.
The analytics layer then tracks the actual GP impact of every move through closed-loop measurement. Not projections. Measured outcomes.
The full picture
The 3-pass architecture is one part of the Replenify platform. It feeds data into and receives intelligence from the broader module ecosystem:
- Initial Allocation uses the same demand signals and store intelligence
- Price Optimization uses sell-through data that rebalancing generates
- Replenishment builds on the demand patterns that rebalancing reveals
- Returns & Markdown uses the same routing intelligence for returns
- Assortment Planning closes the loop by feeding rebalancing insights into buying decisions
For a broader perspective on how rebalancing fits into the inventory management landscape, see our guide to inventory optimization.
The engine is live. The pilot data is real. If you want to see what 1,572 intelligent moves look like for your network, get in touch.