dima853

Posted on May 8

3.1 Prediction methods for conditional jumps

#programming #cpu

1. What is a saturating 2-bit counter?

This is a two-bit saturation counter, which is used in branch prediction algorithms. It helps the processor predict whether a conditional transition (for example, if-else, cycles) will be executed (taken) or not (not taken).

2. How does the saturating 2-bit counter work?

The counter can be in 4 states encoded with two bits:

Status	Value	Prediction
`00`	Strongly not taken	The transition will not be completed
`01`	Weakly not taken	Rather not executed, but may change
`10`	Weakly taken	Rather completed, but may change
`11`	Strongly taken	The transition will be completed

Tag update rules:

If the transition is completed (taken) → counter increases by 1 (but not higher than 11).
If the transition is not completed (not taken) → counter ** decreases by 1 (but not below 00)**.

Example:

The counter is in the 01 state (weakly not taken):
If a transition has occurred → goes to 10 (weakly taken).
- If the transition did not occur → goes to 00 (strongly not taken).

3. Why exactly 2 bits?

1- The bit counter (0 or 1) changes the prediction too abruptly and is subject to noise (frequent switching).
2- The bit counter adds "hysteresis": the transition between taken and not taken requires two erroneous predictions in a row, which makes the algorithm more resistant to random deviations.

4. Where is it used in processors?

Such counters are used in:

Transition History Tables (Branch History Table, BHT) – stores prediction states for different transition addresses.
Global and local prediction schemes (for example, in the Tournament Predictor algorithm for some processors).

5. Example of work

Let's say there is a loop that runs 9 times, and exits on the 10th time.:

The first iterations:
The counter starts at 00 (strongly not taken).
- The loop is running → the counter is gradually increased to 11 (strongly taken).
Last iteration (exit the loop):
The transition is not performed → the counter is reduced to 10 (weakly taken).
- If the cycle does not start anymore, the counter may drop to 00.

6. Optimizations related to saturating counters

Efficiency: Works well for stable branches (for example, loops with a large number of iterations).
Problems:
- Inefficient for random branches (if transitions are unpredictable, the counter will fluctuate frequently).
- Learning delay: It takes several runs for the counter to "learn" the correct prediction.

Conclusion

Saturating 2-bit counter is a simple but effective branch prediction mechanism used in many processors. It provides a balance between stability and adaptability, reducing the number of prediction errors (mispredictions).

Adaptive two-level predictor

1. What is Adaptive Two-Level Predictor?

This is an advanced branch prediction algorithm that uses two levels of information:

The first level is the history of previous transitions (branch history).
The second level is the state table (pattern history table), where each history template has its own saturating counter.

This approach allows to take into account correlations between successive transitions, which is especially useful for loops and complex conditional constructions.

2. How does Two-Level Predictor work?

2.1. Main components

Branch History Register (BHR)
is a register that stores the latest transition results (for example, taken = 1, not taken = 0).
- Example: BHR = 1011 means that the last 4 transitions were T, NT, T, T.
Pattern History Table (PHT)
A table of saturating counters (usually 2-bit), where each counter corresponds to a specific pattern from BHR.
- For example, if BHR = 1011, the processor looks at the PHT record for this template and uses it for prediction.
Global/Local History
- Global History (Global Two-Level) – One BHR for all transitions.
- Local History (Local Two-Level) – a separate BHR for each transfer address.

2.2. Work example

Let's say we have a loop that runs 3 times, and exits on the 4th time.:

Initialization:
- BHR = 0000 (if the history is 4-bit).
- In PHT, all counters are at 00 (strongly not taken).
First iterations (transition taken):
BHR shifts by adding 1: 0001 → 0011 → 0111 → 1111.
- For each template, the PHT is updated towards `taken'.
Last iteration (transition not taken):
BHR = 1111 → prediction taken (but actually not taken).
- The counter for 1111 is decreasing, and the prediction may change next time.

3. Why is Two-Level Predictor better than a simple saturating counter?

Takes into account the context of transitions:
A simple 2-bit counter "forgets" the previous states.
- Two-Level Predictor remembers transition sequences and predicts based on patterns.
Effective for cycles and periodic branching:
- For example, if the transition alternates T, NT, T, NT, BHR will help predict the next outcome.

4. Varieties of Two-Level Predictors

GAg (Global History, Global PHT)
One common BHR for all branches.
- Saves resources, but is less accurate for specific branches.
PAg (Per-Address History, Global PHT)
A separate BHR for each transfer address.
- More accurate, but requires more memory.
PAp (Per-Address History, Per-Address PHT)
Each branch has its own PHT.
- Maximum accuracy, but high difficulty.

5. Optimizations and challenges

5.1. Advantages

✅ Better accuracy than saturating counters.

✅ Effective for cycles and repetitive patterns.

5.2. Problems

❌ Requires more memory (BHR + PHT).

❌ Learning delay (it takes several iterations to set up).

❌ Conflicts in PHT (different branches may use the same counter).

6. Where is it used?

Intel Pentium Pro/II/III – two-level predictors were used.
Modern processors (AMD Zen, Intel Core) – combine two-level predictors with other methods (for example, TAGE predictor).

Conclusion

Adaptive Two-Level Predictor is a powerful branch prediction engine that takes into account the history of transitions to improve accuracy. It is especially useful in loops and complex conditional scenarios, but requires additional hardware resources.

1. The main idea of the Tournament Predictor

The processor uses two (or more) different predictors and dynamically selects which one is best suited for the current branch.

The main predictor (for example, gshare) works well for branches with global correlation.
An alternative predictor (for example, bimodal) is effective for simple branches.
Meta predictor (selector) – decides which of the two predictors to use.

2. How does the Tournament Predictor work?

2.1. Structure

Two predictors:
- P1 (for example, gshare) – takes into account the global history.
- P2 (for example, bimodal) is a simple 2-bit counter.
Meta-predictor table:
Stores 2-bit counters that decide which predictor (P1 or P2) to trust.

2.2. The algorithm of operation

When predicting branches:
P1 and P2 give their predictions.
- Meta-predictor chooses which one to use.
After completing the branch:
If the selected predictor was wrong and the alternative was right, the meta–counter is updated towards the alternative.
- If both were wrong or both were right, the meta counter does not change dramatically.

3. Why is Tournament Predictor effective?

Adaptability: If one predictor does not work well for a particular branch, the system automatically switches to another.
Versatility: Copes well with different types of branches (loops, random conditional jumps).
Reducing the number of mispredictions: The combination of methods gives a more stable result.

4. Example of work

Admit:

P1 (gshare) predicts taken.
P2 (bimodal) predicts not taken.
Meta-predictor tends towards P1 (for example, the counter 10 = weakly prefer P1).

Scenarios:

If the real transition is taken (P1 is right, P2 is wrong) → Meta-predictor enhances trust in P1 (10 → 11).
** If the real transition is not taken (P1 is wrong, P2 is right)** → Meta-predictor reduces confidence in P1 (10 → 01).

5. Where is it used?

Intel Pentium 4 (NetBurst) – used Tournament Predictor.
Modern processors (AMD Zen, Intel Core) – use sophisticated hybrid circuits (for example, TAGE + Loop Predictor).

6. Advantages and disadvantages

✅ High accuracy due to adaptive selection.

✅ Versatility – Suitable for different branching patterns.

❌ Complexity – requires additional tables and logic.

❌ Learning delay – The meta-predictor needs time to adjust.

Conclusion

Tournament Predictor (Agree Predictor) is a powerful hybrid algorithm that dynamically selects the best prediction method for each branch. It is especially useful in modern processors where it is important to minimize mispredictions.

Summary: Branch prediction mechanisms in processors

Method	Description	How does it work?	Positive	Minuses	Where is it used?
Saturating 2-bit Counter	The simplest predictor based on a 2-bit saturation counter.	- 4 states: `00` (strongly NT), `01` (weakly NT), `10` (weakly T), `11` (strongly T). - Increases with `taken`, decreases with `not taken`.	- Easy to implement. - Resistant to noise (hysteresis).	- Slowly adapts. - Predicts complex branches poorly.	A basic component of many predictors.
Two-Level Predictor	Uses the conversion history (BHR) and the template table (PHT).	- BHR stores the latest transition outcomes. - PHT contains 2-bit counters for each BHR template.	- Takes into account the context (better for loops). - More accurate than a 2-bit counter.	- Requires more memory. - Conflicts in PHT.	Intel P6 (Pentium Pro/II/III), AMD K8.
Tournament Predictor (Agree)	Hybrid predictor combining two methods (for example, gshare + bimodal).	- Meta-predictor selects which of the two predictors to use. - Updated based on errors.	- Adaptability. - High accuracy for different types of branches.	- Complex logic. - Additional hardware costs.	Intel Pentium 4, modern hybrid circuits.

Key terms

BHR (Branch History Register) is a register that stores the history of transitions.
PHT (Pattern History Table) is a table of counters corresponding to BHR patterns.
Meta-predictor is a mechanism for choosing between two predictors in Tournament Predictor.

The evolution of predictors

Saturating Counter → Two-Level → Tournament → TAGE (modern CPUs).
The more complex the algorithm, the higher the accuracy, but the higher the resource consumption.

Build seamlessly, securely, and flexibly with MongoDB Atlas. Try free.

MongoDB Atlas lets you build and run modern apps in 125+ regions across AWS, Azure, and Google Cloud. Multi-cloud clusters distribute data seamlessly and auto-failover between providers for high availability and flexibility. Start free!

Learn More

DEV Community

3.1 Prediction methods for conditional jumps

1. What is a saturating 2-bit counter?

2. How does the saturating 2-bit counter work?

Tag update rules:

3. Why exactly 2 bits?

4. Where is it used in processors?

5. Example of work

6. Optimizations related to saturating counters

Conclusion

Adaptive two-level predictor

1. What is Adaptive Two-Level Predictor?

2. How does Two-Level Predictor work?

2.1. Main components

2.2. Work example

3. Why is Two-Level Predictor better than a simple saturating counter?

4. Varieties of Two-Level Predictors

5. Optimizations and challenges

5.1. Advantages

5.2. Problems

6. Where is it used?

Conclusion

1. The main idea of the Tournament Predictor

2. How does the Tournament Predictor work?

2.1. Structure

2.2. The algorithm of operation

3. Why is Tournament Predictor effective?

4. Example of work

5. Where is it used?

6. Advantages and disadvantages

Conclusion

Summary: Branch prediction mechanisms in processors

Key terms

The evolution of predictors

Build seamlessly, securely, and flexibly with MongoDB Atlas. Try free.

Top comments (0)

Get Started