Skip to main content

Data Locality

Performance optimizationCachingData accessGame programmingMemory managementPerformanceAbout 3 min

Also known as

  • Cache-Friendly Design
  • Data-Oriented Design

Intent

The Data Locality design pattern aims to minimize data access times and improve cache utilization by arranging data in memory to take advantage of spatial locality. This pattern is particularly useful in high-performance computing and game development where access speed is crucial.

Explanation

Real-world example

Consider a supermarket where items are arranged based on purchase patterns and categories for efficiency. Just like the Data Locality pattern organizes data in memory for quick access, the supermarket places frequently bought items together and in easily accessible areas. This layout minimizes the time shoppers spend searching for items, enhancing their shopping experience by ensuring that related and popular items are close at hand, much like how data locality improves cache utilization and reduces access latency in computing.

In plain words

The Data Locality pattern organizes data in memory to reduce access times and improve performance by ensuring that data frequently accessed together is stored close together.

Programmatic Example

The Data Locality pattern is a design pattern that aims to improve performance by arranging data in memory to take advantage of spatial locality. This pattern is particularly useful in high-performance computing and game development where access speed is crucial.

In the data-locality module, the pattern is demonstrated using a game loop that processes a bunch of game entities. These entities are decomposed into different domains: AI, physics, and rendering.

The GameEntity class is the main class that represents a game entity. It contains an array of AiComponent, PhysicsComponent, and RenderComponent objects. These components represent different aspects of a game entity.

public class GameEntity {
    private final AiComponent[] aiComponents;
    private final PhysicsComponent[] physicsComponents;
    private final RenderComponent[] renderComponents;
// ...
}

The GameEntity class has a start method that initializes all the components.

public void start() {
  for (int i = 0; i < numEntities; i++) {
    aiComponents[i] = new AiComponent();
    physicsComponents[i] = new PhysicsComponent();
    renderComponents[i] = new RenderComponent();
  }
}

The GameEntity class also has an update method that updates all the components. This method demonstrates the data locality pattern. Instead of updating all aspects of a single entity at a time (AI, physics, and rendering), it updates the same aspect (e.g., AI) for all entities first, then moves on to the next aspect (e.g., physics). This approach improves cache utilization because it's more likely that the data needed for the update is already in the cache.

public void update() {
  for (int i = 0; i < numEntities; i++) {
    aiComponents[i].update();
  }
  for (int i = 0; i < numEntities; i++) {
    physicsComponents[i].update();
  }
  for (int i = 0; i < numEntities; i++) {
    renderComponents[i].update();
  }
}

The Application class contains the main method that creates a GameEntity object and starts the game loop.

public class Application {
  public static void main(String[] args) {
    var gameEntity = new GameEntity(NUM_ENTITIES);
    gameEntity.start();
    gameEntity.update();
  }
}

In this way, the data-locality module demonstrates the Data Locality pattern. By updating all components of the same type together, it increases the likelihood that the data needed for the update is already in the cache, thereby improving performance.

Class diagram

alt text
Data Locality pattern class diagram

Applicability

This pattern is applicable in scenarios where large datasets are processed and performance is critical. It's particularly useful in:

  • Game development for efficient rendering and physics calculations.
  • High-performance computing tasks that require rapid access to large data sets.
  • Real-time data processing systems where latency is a critical factor.

Known Uses

  • Game engines (e.g., Unity, Unreal Engine) to optimize entity and component data access.
  • High-performance matrix libraries in scientific computing to optimize matrix operations.
  • Real-time streaming data processing systems for efficient data manipulation and access.

Consequences

Benefits:

  • Improved Cache Utilization: By enhancing spatial locality, data frequently accessed together is stored close together in memory, improving cache hit rates.
  • Reduced Access Latency: Minimizes the time taken to fetch data from memory, leading to performance improvements.
  • Enhanced Performance: Overall system performance is improved due to reduced memory access times and increased efficiency in data processing.

Trade-offs:

  • Complexity in Implementation: Managing data layout can add complexity to the system design and implementation.
  • Maintenance Overhead: As data access patterns evolve, the layout may need to be re-evaluated, adding to the maintenance overhead.
  • Less Flexibility: The tight coupling of data layout to access patterns can reduce flexibility in how data structures are used and evolved over time.

Credits