Optimization and decision making for building energy efficiency strategies

Climate-adaptive building design through explainable AI: 3D spatial layout automation and evolutionary optimization across climate zones

Peiying Huang, Yanxiang Yang, Wen Gao, Xing Zheng, Pengyuan Shen

2026

Journal of Building Engineering


Fig. 3. Illustration of spatial layout growth process: (a) Initial state with starting units (b) Parameter structure and data flow (c) Zone growth progression.

Summary

This study introduces a novel framework for climate-adaptive building design by integrating explainable AI (XAI) with evolutionary optimization. It automates the generation of 3D spatial layouts tailored to diverse climate zones, balancing energy efficiency and thermal comfort. By employing SHAP analysis, the model deciphers complex non-linear relationships between morphological parameters and performance, providing architects with interpretable design rules. The framework demonstrates that optimized, climate-responsive layouts can significantly reduce energy demand, offering a robust tool for data-driven architectural decision-making.

Abstract

Spatial layout significantly impacts building energy performance, yet systematic optimization methods across different climates remain limited. This research develops an integrated three-stage framework combining automated layout generation, evolutionary optimization, and explainable artificial intelligence (XAI) to reduce energy consumption in mixed-use office buildings. Using a typical eight-story office building, we conducted comparative analysis across five Chinese climate zones: severely cold (Harbin), cold (Beijing), hot summer-cold winter (Shanghai), subtropical (Shenzhen), and mild (Kunming). In Stage 1 - Layout Generation, grid-based algorithms with geometric constraints automatically generate energy-efficient spatial configurations. In Stage 2 - Optimization, evolutionary algorithms (SPEA-2 and HypE) integrated with building energy simulation minimize cooling and heating loads, generating over 1700 optimized solutions per climate zone. In Stage 3 - XAI Interpretation, random forest models predict energy performance with high accuracy (R² = 0.801-0.874), while SHAP analysis quantifies the contribution of 26 spatial layout features. Results demonstrate substantial energy savings potential. Subtropical climate (Shenzhen) achieves the best absolute performance with 17.25 % reduction in total loads, while mild climate (Kunming) shows the highest percentage reduction at 24.91 %. Average energy savings across all climate zones range from 9.67 % to 13.60 % for heating-dominated regions. SHAP analysis reveals climate-specific design strategies: orientation area distribution is the most critical factor for subtropical climates, while space centralization and space adjacency optimization are essential for cold regions. This methodology provides architects and engineers with computationally efficient, evidence-based tools for climate-adaptive sustainable building design during early planning stages.

1. Introduction

The construction industry plays a significant role in global energy consumption in the face of the global climate change challenge. According to United Nations Environment Programme (UNEP) and the Global Construction Alliance, the global construction industry accounted for 32% of total global energy consumption and 34% of global carbon emissions as of 2023 [1]. With rising living standards globally, building energy consumption continues to grow, making energy efficiency interventions increasingly urgent. Meanwhile, IPCC suggested that the construction sector has great potential for emission reduction and relatively lower costs [2]. This combination of high emissions contribution and significant reduction potential positions the construction sector as a critical lever for achieving carbon peak and carbon neutrality goals. Building energy efficiency is an overall optimization issue that requires comprehensive consideration and collaboration from multiple fields through the entire design process, which needs to balance influencing factors such as passive and active design strategies. According to the ANNEX-30 project study by the International Energy Agency, the performance of buildings is largely influenced by the early design stage, and decisions made in the early design stage have more than 40% potential for energy savings [3]. Therefore, understanding and optimizing early-stage design decisions is fundamental to achieving substantial energy reductions in the built environment. Among the various design factors affecting building energy performance, building envelope and shape have been extensively studied as primary determinants. Building envelope performance, including thermal insulation properties, window-to-wall ratios, and material selection, directly influences heat transfer between interior and exterior environments [4–7]. Evolutionary optimization approaches have been widely applied to envelope design across different climate and seismic zones, demonstrating significant potential in energy saving and carbon emission reduction [8].

Recent advances in AI-based and optimization-driven approaches have also significantly expanded the scope of computational building design. Advanced computational methods, including artificial neural network-based genetic algorithms, have proven effective in optimizing building geometry for thermal energy efficiency in public buildings [9]. Parametric design frameworks integrated with genetic algorithms have been applied to optimize climate-responsive passive building strategies [10]. At the urban scale, they have also been applied to optimize urban morphology, building density, and street configurations for energy efficiency and environmental performance [11]. Deep learning methods, particularly generative adversarial networks, have demonstrated capability in predicting urban-scale energy consumption patterns and generating energy-efficient building forms [12]. Reinforcement learning approaches have been employed for real-time building energy management and HVAC control optimization [13]. Multi-objective optimization frameworks combined with building performance simulation have enabled optimization of building energy systems [14].

Building shape factors, such as aspect ratio, compactness, and surface-to-volume ratio, substantially affect energy consumption patterns [15,16]. However, beyond envelope and shape optimization, studies in the past decade have revealed that building spatial layout, which refers to the internal arrangement of functional spaces, represents another critical yet understudied dimension of early-stage energy efficiency design. It is found that effective spatial layout design can reduce unnecessary energy consumption and improve the overall sustainability of the building [17]. Hence, the impact of building spatial plans on energy consumption is a key factor in energy conservation design, especially in the early stages of building design [18–20]. Conventional design methods have struggled to meet complex optimization requirements, while computational design and machine learning models, as emerging tools, have provided important support for building performance optimization [21]. Computational design, through computer simulation and optimization algorithms, predicts the performance of the building layout in the early design stage and can evaluate the performance of different schemes in terms of energy efficiency, natural lighting, thermal comfort, etc. [22,23]. This computational design-based approach can improve design efficiency, guide design decisions, and provide an important basis for building energy conservation. Meanwhile, machine learning methods can effectively facilitate the prediction of the complex relationship between building layout and energy consumption by analyzing historical data [24]. Compared with traditional physical simulation, machine learning can handle more variables, perform evolutionary optimization, and propose optimal design schemes, especially under different climatic conditions [25]. These models provide data-driven decision support for design teams to help achieve energy-efficient design and emission reduction targets. By combining computational design with machine learning, architectural design can achieve an integrated optimization of multiple goals such as energy efficiency, comfort, and environmental adaptability. These emerging methods now equip designers with better approaches to rapid decision-making and drive progress in sustainable development and carbon reduction efforts in the construction industry.

The remainder of this paper is organized as follows: Section 2 presents a comprehensive literature review examining the impact of spatial layout on building energy performance, climate-adaptive design strategies, and the application of simulation-based optimization and machine learning methods in building performance research. Section 3 describes the research methodology and framework, including the automated spatial layout generation algorithm, machine learning-based energy prediction model, spatial layout design variables and feature engineering, optimization problem formulation, and case study building configuration across five climate zones. Section 4 presents and discusses the results, analyzing climate-specific impacts of spatial layout on building energy performance, validating the machine learning model performance, and providing interpretability analysis using explainable AI to reveal design mechanisms. Section 5 concludes the paper by summarizing key findings, discussing climate-specific design strategies, acknowledging research limitations, and suggesting directions for future work.

2. Literature review

Under the condition of a fixed building form, spatial layout inside the building is shown to exert significant impact on energy consumption [17]. Du et al. found that an office building in Sweden can reduce its annual heating and cooling energy demands by 14% and 57%, respectively, by changing spatial layout, while an office building in the UK reduced its peak lighting demands by 67% and 43%, respectively, by changing the layout [26]. In addition, Du et al. took an office building as an example and proposed 11 different spatial layout schemes under a fixed building profile [18]. They investigated the building performance of the research object in three different climates (temperate, cold and tropical) and three typical cities (Amsterdam, Harbin and Singapore) and conducted lighting and energy consumption simulations. The results show that under a fixed building profile, optimizing the spatial layout scheme can significantly reduce the energy consumption of buildings. Therefore, it is evident that even within the same building outline, a reasonable spatial layout can effectively improve the overall energy performance of the building.

The influence of spatial layout on building performance is mainly reflected in multiple aspects such as the organization mode of cooling and heating needs, the coordination degree between building orientation and natural lighting resources, the formation of ventilation paths and the influence on natural ventilation efficiency, as well as the spatial integration relationship between different usage periods and energy consumption patterns. For example, when high-energy-demand spaces are concentrated in the core area of a building far from the envelope, heat transfer loss can be effectively reduced [27]. Placing office areas where daylight resources are high not only improves the quality of lighting but also reduces lighting energy consumption [28]. In addition, a well-organized flow line can enhance the efficiency of natural ventilation and reduce reliance on mechanical systems. If the operating hours of different spaces and the pattern of heat load changes can be matched with each other, it will also help improve the operational efficiency of the building system [29]. The mutual coupling and dynamic interaction of these factors in architectural space make the influence of spatial layout on the energy performance of buildings show a high degree of complexity and adaptive differences [30].

Moreover, in the context of global climate change, passive measures are essential to improve the energy performance and climate adaptability of buildings [6]. The rational choice of design strategies has significant regional characteristics in different climate conditions, and the same type of energy efficiency measures may even have opposite results in different climate zones. Therefore, formulating building design strategies based on local conditions is the key path to achieving building energy conservation goals. Taking the insulation performance of the envelope as an example, it is considered crucial to enhance the insulation effect of the envelope in temperate climates, and this view has been verified in southern Chile, but in some other regions, increasing insulation poses the risk of overheating [31,32]. It is also emphasized by previous research that the applicability of passive measures is closely related to climatic characteristics, even with changing future climate conditions [6].

Similarly, the overall building spatial layout also has different influence mechanisms on building energy consumption in different climate zones. Recent studies have conducted case studies and analyses based on specific climatic conditions and specific building types. Shi et al. investigated the building layout and energy consumption of 30 public hospitals in cold regions of China, classified the layout patterns of outpatient and inpatient departments, analyzed and compared them with energy consumption data, and found that among the sampled hospital cases, the general outpatient department had the highest energy-saving rate of 16.3% by using the grid-like courtyard layout. The "L" layout in the inpatient department achieved the highest energy efficiency (5.9%) [33]. Dai et al. investigated the impact of university building layouts on energy performance in the cold and dry Xinjiang region. They conducted energy consumption simulation on five typical layouts of individual university buildings using EnergyPlus, and the results showed that in the cold region, the lower zone atrium layout consumed more energy than the intra-zone corridor and single-side corridor layouts [34]. In the cold region, Cheng et al. used DesignBuilder software to simulate the energy consumption of six typical rural residential layouts and found that the building layout had a greater impact on heating energy consumption and a smaller impact on cooling energy consumption, among which the rectangular building layout had the best energy-saving effect [35].

In addition to studies in a single climate zone, there were also studies comparing the effects of building geometry and layout on energy consumption under different climate conditions. Susorova et al. examined the impact of building and window geometry parameters on energy consumption and energy savings in office buildings and found that in tropical climates, medium and large window areas with a window-to-wall ratio (WWR) of 50–80% in medium and high depth rooms (9–15 m) could achieve maximum energy savings; in temperate climates, medium and high depth rooms (6–15 m) achieved better energy savings with medium and large window areas (WWR 50–60%). In cold climates, energy savings mainly occurred with small window areas (WWR 20–30%) in shallow rooms (6 m) and with medium and large window sizes (WWR 50–80%) in south-facing rooms of medium and high depth (9–15 m), while south-facing rooms generally had better energy performance in all climates [36]. Du et al. analyzed the impact of spatial layout on building energy demand under three climatic conditions (Amsterdam, Harbin, and Singapore) and found that in temperate climates, spatial layout had the highest impact on energy performance, especially in terms of lighting requirements; in cold climates, the impact of spatial layout on energy performance was relatively small; and in tropical climates, spatial layout had the least impact on building energy performance [18]. There are significant differences in response mechanisms to building spatial layout across different climate zones. These differences are mainly influenced by a combination of solar radiation intensity, temperature and humidity conditions, ventilation potential, and the dominant type of heat load (heating or cooling). Therefore, exploring climate-sensitive spatial layout strategies is an important direction for promoting the construction of energy-efficient and climate-adaptive design systems in buildings. A comprehensive overview of the literature on building spatial layout and building performance related domains is presented in Table 1.

To sum up, climate factors play a crucial role in the impact of building layout on energy performance. Only by combining specific climatic conditions can one optimize spatial layout to effectively improve building energy performance. Nevertheless, compared to other architectural design elements, there are relatively few specific studies on the energy performance of building spatial layout under different climatic conditions. At present, most of the research on spatial layouts is focused on cold climate zones, while research on other climate zones is relatively scarce. In these case studies, many are based on specific building types, first summarizing and classifying existing buildings, and then conducting in-depth analyses of typical layouts. While this approach can provide more specific and targeted research results, it also has obvious limitations. For example, due to the limited number of cases of research subjects, the conclusions drawn may not have broad applicability. At the same time, in real-world case studies, it is difficult to completely eliminate the influence of other possible interfering factors on the research results. It is worth noting that the effects of various elements in building spatial layout on energy performance are complex and interrelated. However, most current studies have not conducted quantitative analyses of these influencing factors to clearly explore their specific relationship with building energy performance.

On the other hand, traditional research on building energy conservation mostly focuses on single-objective optimization or empirical rules, making it difficult to systematically deal with the large number of high-dimensional design variables and their complex combination relationships involved in building spatial layouts. In recent years, building energy consumption simulation, as one of the core supporting technologies for achieving performance-driven optimization, has been playing an increasingly crucial role in multi-scale and multi-stage building carbon reduction practices [37]. Meanwhile, simulation-based optimization has shown unique advantages in dealing with the discontinuity, multimodal characteristics, target conflicts and uncertainties of building optimization problems [38]. Among them, simulation optimization methods represented by evolutionary algorithms (such as NSGA-II, MOEA/D, NSDE, etc.) are widely used in building performance evaluation [38,39]. These methods can seek a balance among multiple building performance objectives, generate Pareto optimal solution sets, and provide a technical path for multi-dimensional co-optimization of building performance. Research on the combination of building energy consumption simulation and evolutionary algorithms has achieved remarkable results in various types of buildings, including residential and office buildings [40–43]. These tools enable efficient construction, iteration and evaluation of design schemes in the simulation-optimization cycle.

Nonetheless, despite these technological improvements, several key research gaps remain in understanding and optimizing building spatial layouts across different climatic environments. The available studies are largely single-climate investigations or case studies, with no systematic cross-climate comparative analysis employing uniform methodologies. Although computational optimization and machine learning have demonstrated their capabilities in their respective domains, they have not yet been fully deployed, at least for spatial layout problems, together with explainable AI methods that give interpretable accounts of the complex layout-energy relationships. In addition, the literature tends to rely on predetermined, ad hoc layouts, with no formal quantification of the relative significance of spatial variables and no investigation of the non-linear interactions among variables across a range of climate conditions. These gaps are detailed as follows:

Gap 1: Lack of systematic cross-climate comparative analysis. Existing spatial layout studies are predominantly single-climate investigations or building-specific case studies (e.g., Shi et al. [33] focusing solely on cold regions, Dai et al. [34] examining only Xinjiang climate). No systematic cross-climate comparative analysis employing uniform methodologies, consistent building typologies, and standardized evaluation metrics has been conducted to identify both universal design principles and climate-dependent strategies.

Table 1 Summary of literature on building spatial layout and building performance related studies.

| Authors | Year | Study Focus | Climate/Location | Building Type | Key Findings | Methodology | Limitations |
|---|---|---|---|---|---|---|---|
| Hemsath et al. [15] | 2015 | Building geometry effects on energy use | General | Generic buildings | Aspect ratio and volume stacking reduce energy consumption | Sensitivity analysis | Limited to geometric parameters |
| Du & Tiantian [26] | 2020 | Space layout effects on energy demand | Sweden, UK | Office buildings | 14-67 % reduction in heating, cooling, and lighting demands | Case study analysis | Specific cases, small sample |
| Du et al. [18] | 2021 | Space layout impact across climates | Amsterdam, Harbin, Singapore | Office buildings | Temperate: highest impact; Cold: small impact; Tropical: least impact | Energy simulation | 11 predefined layouts only |
| Shi et al. [33] | 2021 | Hospital building layout and energy consumption | Cold regions of China | Public hospitals | Grid courtyard: 16.3 % savings; "L" layout: 5.9 % savings | Statistical analysis | Building-specific, single climate |
| Dai et al. [34] | 2019 | University building layout effects | Xinjiang, China | University buildings | Atrium layouts consume more energy than corridor layouts | EnergyPlus simulation | 5 layouts, single climate |
| Cheng et al. [35] | 2019 | Rural residential layout optimization | Cold region of China | Rural residences | Rectangular layout optimal; heating > cooling impact | DesignBuilder simulation | Rural-specific, limited variations |
| Susorova et al. [36] | 2013 | Building and window geometry parameters | Tropical, temperate, cold | Office buildings | Climate-specific WWR: 50-80 % (tropical), 50-60 % (temperate), 20-30 % (cold) | Parametric analysis | Window focus, not comprehensive |

Gap 2: Limited application of explainable AI to spatial layout optimization. Although computational optimization and machine learning have demonstrated capabilities in building performance prediction, they have not been fully deployed in spatial layout problems to provide interpretable explanations of complex layout-energy relationships. Most studies treat optimization algorithms and machine learning models as "black boxes," offering optimized solutions without revealing the underlying design mechanisms or the relative importance of different spatial variables.

Gap 3: Absence of formal quantification of spatial layout variables. The literature on spatial layouts predominantly relies on predetermined, ad hoc layout configurations (e.g., Du et al. [18] examined a limited number of predefined layouts) without formal quantification of spatial variables such as concentration/dispersion patterns, orientation distributions, and adjacency relationships. This limits understanding of which specific spatial characteristics drive energy performance and how these characteristics interact non-linearly under different climate conditions.

Gap 4: Limited integration of automated generation with optimization. While evolutionary algorithms have been successfully applied to envelope and form optimization, their application to spatial layout generation remains limited. Most spatial layout studies evaluate manually designed alternatives rather than employing automated generation methods capable of systematically exploring vast design spaces while satisfying complex geometric, functional, and regulatory constraints.

Compared to the existing studies, this research compensates for the mentioned gaps by creating a coherent research framework integrating the automated generation of spatial layouts and evolutionary optimization with explainable machine learning analysis. We developed an automated 3D spatial layout generation method using grid-based algorithms implemented in Rhino-Grasshopper with customized Python code, enabling systematic exploration of energy-efficient design alternatives. The generated layouts are optimized using evolutionary algorithms (SPEA-2 and HypE) integrated with building energy simulation (Honeybee/EnergyPlus) across five Chinese climate zones: severely cold (Harbin), cold (Beijing), hot summer-cold winter (Shanghai), subtropical (Shenzhen), and mild (Kunming). Random forest regression models predict energy performance from 26 spatial layout features with high accuracy (R² > 0.87), while SHapley Additive exPlanations (SHAP) analysis quantifies feature contributions to reveal climate-specific design mechanisms. This systematic cross-climate comparison using consistent building typology and evaluation metrics demonstrates 9.67% to 24.91% energy savings and identifies that orientation distribution optimization is critical for subtropical climates while space centralization is essential for cold regions, providing architects and engineers with evidence-based tools for climate-adaptive sustainable building design during early planning stages.

3. Methodology and research framework

3.1. Overall research framework and workflow

This study explores the impact of different spatial layouts on building energy consumption by constructing a research framework combining evolutionary optimization and proxy models. The overall process comprises three stages, as shown in Fig. 1: data preparation, layout generation and optimization, and energy consumption analysis. Data preparation covers spatial layout requirements, building exterior profile shapes, and typical weather files. Space requirements define the area requirements for each space inside the building, providing a basis for generating the spatial layout; the outline shape of the building, as a geometric constraint, limits the spatial range for layout optimization. In the generation and optimization stage, the 3D building spatial layout generation method is used together with evolutionary algorithms to automatically generate, optimize, and evaluate energy-saving-oriented spatial layouts. Based on a fixed building profile, the tool generates multiple sets of spatial layout schemes that meet spatial layout requirements and have low energy consumption through iterative calculations, laying the foundation for further analysis of the simulation data. In the energy consumption analysis stage, an efficient energy consumption prediction proxy model is constructed using the random forest model and combined with explainable artificial intelligence technology (SHAP value analysis) to reveal the complex relationship between spatial layout and building energy consumption. Through model analysis and statistical summary, the study identifies the specific impact of different spatial layouts on building energy consumption under multiple climatic conditions.


Fig. 1. Diagram of the framework of the paper.

3.2. Automated spatial layout generation method

This study develops an automated spatial layout generation method that combines inverse workflow design principles with computational optimization to generate energy-efficient building layouts. The method operates on the Rhino-Grasshopper platform and utilizes custom Python algorithms to systematically explore the design space while satisfying both space use requirements and energy performance objectives under multiple design constraints.

3.2.1. Inverse design workflow framework

The work implements an inverse workflow approach where energy performance targets and spatial requirements drive the design process, rather than traditional forward design methods that evaluate performance after layout creation. This approach consists of three core stages: (1) establishing energy performance requirements and spatial programming constraints, (2) applying generative algorithms to automatically produce layout configurations, and (3) optimizing generated layouts through energy simulation feedback to identify optimal solutions. The inverse methodology enables direct exploration of energy-efficient design alternatives, significantly improving computational efficiency compared to conventional trial-and-error approaches.
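As an illustration of stage (3), the selection of layouts that trade off cooling against heating loads can be sketched as a non-dominated (Pareto) filter. This is a minimal sketch under stated assumptions, not the SPEA-2 or HypE procedure used in the study: the function names `dominates` and `pareto_front` and the load values are hypothetical, and the loads would in practice come from the energy simulation.

```python
from typing import List, Tuple

# (cooling_load, heating_load) for one layout; both are minimized.
Loads = Tuple[float, float]


def dominates(a: Loads, b: Loads) -> bool:
    """Return True if a is no worse than b on both loads and better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Loads]) -> List[Loads]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical simulated loads (arbitrary units) for four generated layouts.
simulated = [(120.0, 80.0), (110.0, 95.0), (130.0, 85.0), (115.0, 78.0)]
front = pareto_front(simulated)
```

Here the first and third layouts are dominated by the fourth and first, respectively, so only the second and fourth remain on the front. Evolutionary algorithms add ranking, archiving, and variation operators on top of this dominance test.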

3.2.2. Design constraints and generation rules

The spatial layout generation method operates under multiple constraint categories that ensure generated layouts meet both space use and regulatory requirements. Geometric constraints define the spatial boundaries and structural limitations, including building envelope boundaries, column grid alignment requirements, and floor height restrictions. The method enforces strict adherence to the building’s external contour while maintaining compatibility with the structural system. Functional constraints ensure that generated layouts satisfy programmatic requirements and operational needs. These include minimum and maximum area requirements for each zone, defined as:

$$A_{\min}(z) \leq A_{\text{generated}}(z) \leq A_{\max}(z)$$

where $A_{\text{generated}}(z)$ represents the actual generated area for zone $z$. Additionally, space use constraints encompass adjacency requirements between specific zones, accessibility standards for circulation paths, and floor assignment restrictions for certain spaces.
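The area constraint can be checked zone by zone once a layout has been generated. The following is a minimal sketch, assuming areas are stored in dictionaries keyed by zone name; `AreaBounds`, `area_violations`, and the example values are hypothetical and not taken from the study's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class AreaBounds:
    a_min: float  # minimum allowed area for the zone (m2)
    a_max: float  # maximum allowed area for the zone (m2)


def area_violations(generated: Dict[str, float],
                    bounds: Dict[str, AreaBounds]) -> List[str]:
    """Return the zones whose generated area falls outside [a_min, a_max].

    A zone missing from `generated` is treated as having zero area.
    """
    return [zone for zone, b in bounds.items()
            if not (b.a_min <= generated.get(zone, 0.0) <= b.a_max)]


# Hypothetical space requirements and one generated layout.
bounds = {"office": AreaBounds(400.0, 500.0), "meeting": AreaBounds(80.0, 120.0)}
generated = {"office": 450.0, "meeting": 130.0}
violations = area_violations(generated, bounds)
```

In this example only the meeting zone violates its bounds, because 130 m2 exceeds the assumed maximum of 120 m2.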

Connectivity constraints maintain spatial coherence and ensure proper circulation throughout the building. The method enforces zone contiguity requirements, preventing fragmented spaces that could compromise operational efficiency. Vertical circulation accessibility is mandatory for all zones, with the algorithm verifying that each space maintains connection to primary circulation systems. The connectivity validation function can be expressed as:

$$\operatorname{Connectivity}(z) = \sum_{u \in z} \operatorname{Adjacent}(u, \text{circulation}) \geq 1$$

where u represents individual spatial units within zone z, and Adjacent(u, circulation) evaluates proximity to circulation systems.
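A minimal sketch of this connectivity check is given below. It assumes that spatial units are integer grid cells (i, j, k) and that a unit counts as adjacent to circulation when it shares a face with a circulation cell on the same floor; the source does not specify the proximity rule, so this choice and the names `adjacent`, `connectivity`, and `is_connected` are illustrative assumptions.

```python
from typing import Iterable, Set, Tuple

Unit = Tuple[int, int, int]  # grid cell (i, j, k); k is the floor index

# Assumed proximity rule: face-sharing neighbours on the same floor.
PLAN_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def adjacent(u: Unit, circulation: Set[Unit]) -> int:
    """Return 1 if unit u touches any circulation cell, else 0."""
    i, j, k = u
    return int(any((i + di, j + dj, k) in circulation
                   for di, dj in PLAN_OFFSETS))


def connectivity(zone_units: Iterable[Unit], circulation: Set[Unit]) -> int:
    """Sum Adjacent(u, circulation) over all units u in the zone."""
    return sum(adjacent(u, circulation) for u in zone_units)


def is_connected(zone_units: Iterable[Unit], circulation: Set[Unit]) -> bool:
    """A zone passes when at least one of its units touches circulation."""
    return connectivity(zone_units, circulation) >= 1


# Hypothetical corridor cells and two zones on floor 0.
circulation = {(2, 0, 0), (2, 1, 0)}
zone_a = [(0, 0, 0), (1, 0, 0)]  # (1, 0, 0) touches the corridor
zone_b = [(0, 3, 0), (0, 4, 0)]  # isolated from the corridor
```

With these inputs `is_connected(zone_a, circulation)` holds while `zone_b` fails the check and would be rejected or repaired by the generator.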

Regulatory constraints ensure compliance with building codes and safety requirements, including minimum egress path widths, maximum travel distances to exits, and fire separation requirements between specific zones. The method incorporates these constraints through rule-based validation systems that continuously monitor generated layouts against established criteria. These constraints are parameterized through: building contour polylines and column grid spacing for geometric boundaries; space requirements tables with area bounds and adjacency matrices for functional requirements; graph-based representations for connectivity validation; and rule-based functions for regulatory compliance based on Chinese building codes.

3.2.3. 3D spatial layout generation algorithm

3.2.3.1. Grid-based spatial representation. The generation process begins with a grid-based spatial representation system that discretizes the building volume into manageable spatial units. To accommodate irregular building geometries, the algorithm employs a two-stage grid projection method. First, an ideal orthogonal 3D grid is established based on the building’s structural column network, providing a systematic framework for spatial organization. This ideal grid is then projected onto the actual building geometry, transforming regular spatial units into building-specific volumes while preserving spatial relationships and adjacency requirements, as shown in Fig. 2.

Fig. 2. Ideal mesh model for buildings.

The transformation process maintains spatial continuity through geometric mappings that preserve topological relationships between adjacent units. For a spatial unit $U_{i,j,k}$ in the ideal grid at coordinates $(i, j, k)$, the corresponding projected unit $U'_{i,j,k}$ in the irregular building form is calculated using:

$$U'_{i,j,k} = T(U_{i,j,k}, G_{\text{building}})$$

where $T$ represents the transformation function and $G_{\text{building}}$ defines the building’s geometric constraints. This approach ensures that spatial allocation logic remains consistent regardless of building form complexity.

3.2.3.2. Zone growth algorithm. The spatial layout generation employs a zone growth algorithm that simulates the organic expansion of spatial layout from designated starting points. The algorithm operates on two parameter sets: fixed parameters defining space use requirements (zone, area_demand, area_tolerance, floor) and variable control parameters governing growth patterns (start_unit, step_len, direction).

The growth process follows an iterative expansion mechanism in which spaces expand from initial seed units according to specified growth rules. For each zone $z$, the algorithm calculates the current area $A_{\text{current}}(z)$ and compares it against the required area $A_{\text{demand}}(z)$ with tolerance $\tau$:

$$\text{Growth\_Complete}(z) = A_{\text{current}}(z) \geq A_{\text{demand}}(z) \times (1 - \tau)$$

The zone expansion follows directional priorities defined by the direction parameter, with growth step lengths controlled by step_len values. Adjacent vacant units are systematically incorporated into expanding zones based on connectivity rules and spatial constraints. The generation process is illustrated and visualized in Fig. 3.
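The growth mechanism can be sketched on a single floor as follows. This is a simplified illustration under stated assumptions, not the authors' Algorithm 2: the grid is two-dimensional, `direction` is an ordered list of compass codes, at most `step_len` vacant units are absorbed per direction in each pass, and growth is taken to be complete once the area reaches `area_demand * (1 - area_tolerance)`. The function name `grow_zone`, the `OFFSETS` table, and the 4 x 4 example grid are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Unit = Tuple[int, int]  # (row, col); row 0 is taken to be the north edge
OFFSETS: Dict[str, Unit] = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}


def grow_zone(start_unit: Unit, direction: List[str], step_len: int,
              area_demand: float, area_tolerance: float,
              vacant: Set[Unit], unit_area: float = 1.0) -> Set[Unit]:
    """Grow one zone from a seed unit; claimed units are removed from `vacant`."""
    zone: Set[Unit] = set()
    if start_unit not in vacant:
        return zone
    zone.add(start_unit)
    vacant.discard(start_unit)
    target = area_demand * (1.0 - area_tolerance)  # assumed completion threshold

    while len(zone) * unit_area < target:
        grew = False
        for d in direction:  # directional priority order
            dr, dc = OFFSETS[d]
            # Vacant units lying directly beyond the current zone in direction d.
            candidates = sorted((r + dr, c + dc) for (r, c) in zone
                                if (r + dr, c + dc) in vacant)
            for unit in candidates[:step_len]:
                if len(zone) * unit_area >= target:
                    break
                zone.add(unit)
                vacant.discard(unit)
                grew = True
        if not grew:  # no expansion possible: stop even if demand is unmet
            break
    return zone


# Hypothetical 4 x 4 floor: grow an office of 6 units eastwards, then southwards.
grid = {(r, c) for r in range(4) for c in range(4)}
office = grow_zone((0, 0), ["E", "S"], step_len=2,
                   area_demand=6.0, area_tolerance=0.0, vacant=grid)

# A zone that runs out of vacant units stops early instead of looping forever.
tiny = {(0, 0), (0, 1)}
blocked = grow_zone((0, 0), ["E", "S"], step_len=2,
                    area_demand=5.0, area_tolerance=0.0, vacant=tiny)
```

Here the office grows into a 2 x 3 block in the north-west corner, and the second call shows the termination branch for exhausted expansion opportunities.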

3.2.4. Algorithm implementation

The complete layout generation process is formalized through two complementary algorithms.

Algorithm 1. - Main Layout Generation: This algorithm coordinates the overall generation process, beginning with vertical circulation core establishment through create_transport_plan(.), followed by core vertical extension via create_vertical_mass(.). Functional zones are then simultaneously grown using grow_program(.), with any remaining vacant spaces filled through fill_program(.). The pseudocode of Algorithm 1 can be found in Appendix 1.

Algorithm 2. - Iterative Growth Process: This algorithm manages the stepwise expansion of individual zones. The process initializes with starting point assignment based on gene data, then iteratively expands zones according to directional sequences and step lengths while monitoring area constraints and adjacency requirements. The pseudocode of Algorithm 2 can be found in Appendix 2.

Fig. 3. Illustration of spatial layout growth process: (a) Initial state with starting units (b) Parameter structure and data flow (c) Zone growth progression.

The growth termination criteria ensure that zones either achieve their required areas within the specified tolerances or exhaust the available expansion opportunities. When a zone cannot continue growing from its current configuration, the algorithm selects new vacant units as alternative starting points, ensuring comprehensive space utilization. The constraint validation system operates continuously during the generation process, rejecting invalid configurations immediately and redirecting the algorithm toward feasible solutions. The rapid generation capability of the proposed method enables extensive design space exploration within practical time constraints. The algorithm handles both regular orthogonal building forms and irregular geometric configurations while maintaining space connectivity and spatial coherence under all imposed constraints.

3.3. Machine learning-based energy prediction model

Building energy consumption analysis involves complex, multidimensional variables with highly nonlinear relationships that traditional analytical methods struggle to capture effectively. This study develops a comprehensive machine learning-based prediction framework combining random forest regression with explainable artificial intelligence (XAI) techniques to provide both accurate predictions and interpretable insights into spatial layout-energy performance relationships.

3.3.1. Random forest regression model

Random forest, an ensemble learning technique widely used for regression analysis [44], demonstrates superior performance in building energy consumption prediction due to fewer parameters, stronger generalization ability, and exceptional resistance to overfitting compared to other methods [45,46]. The algorithm creates T decision trees trained on bootstrap samples of the original data, with final energy consumption predictions calculated as:

$$\hat{y} = \frac{1}{T} \sum_{t=1}^{T} h_{t}(x)$$

where $h_{t}(x)$ represents the prediction from the $t$-th tree for input features $x$. This averaging process reduces variance and improves prediction stability while naturally handling missing values and mixed data types with minimal hyperparameter tuning requirements.
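The averaging step can be illustrated with a deliberately small stand-in for a random forest: bootstrap-aggregated depth-1 regression trees (stumps) on a single feature. A real random forest also grows deeper trees and samples feature subsets at each split; those parts are omitted here. The data and all function names are hypothetical.

```python
import random
from statistics import mean
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float]  # (feature value x, target y)


def fit_stump(data: Sequence[Sample]) -> Callable[[float], float]:
    """Fit a depth-1 regression tree: one threshold, mean target on each side."""
    xs = sorted({x for x, _ in data})
    overall = mean(y for _, y in data)
    best = (float("inf"), None, overall, overall)
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in data if x <= thr]
        right = [y for x, y in data if x > thr]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, thr, ml, mr)
    _, thr, ml, mr = best
    if thr is None:  # bootstrap sample with a single distinct x value
        return lambda x: overall
    return lambda x: ml if x <= thr else mr


def fit_forest(data: Sequence[Sample], n_trees: int, seed: int = 0) -> List[Callable[[float], float]]:
    """Train T trees, each on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    return [fit_stump(rng.choices(data, k=len(data))) for _ in range(n_trees)]


def predict(forest: List[Callable[[float], float]], x: float) -> float:
    """Ensemble prediction: the mean of the T individual tree predictions."""
    return sum(h(x) for h in forest) / len(forest)


# Hypothetical step-shaped data: low load for x < 5, high load otherwise.
data = [(float(x), 10.0 if x < 5 else 20.0) for x in range(10)]
forest = fit_forest(data, n_trees=25)
low, high = predict(forest, 0.0), predict(forest, 9.0)
```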

3.3.2. Explainable AI (SHAP) analysis

To address the "black box" nature of machine learning models [47,48], this study employs SHAP (SHapley Additive exPlanations) analysis [49–51] based on cooperative game theory concepts. SHAP quantifies each input variable’s importance to model predictions by decomposing outputs into additive feature contributions:

$$f(x) = \phi_{0} + \sum_{i=1}^{m} \phi_{i}(x)$$

where $\phi_{0}$ represents the expected model output over the baseline dataset, and $\phi_{i}(x)$ denotes the SHAP value for feature $i$. SHAP provides both global interpretability (overall feature importance patterns across the dataset) and local interpretability (explanations for individual predictions) [52,53], enabling architects to understand how specific layout configurations achieve their energy performance outcomes through multiple visualization techniques including feature importance plots, dependence plots, and waterfall plots.
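The additive decomposition can be checked numerically with exact Shapley values on a toy model. The sketch below is not the SHAP library and not the paper's random forest: it enumerates all feature coalitions directly, replaces absent features with values from a small background set, and uses a made-up three-feature load function whose coefficients are purely illustrative.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Tuple


def shapley_values(model: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   background: Sequence[Sequence[float]]) -> Tuple[float, List[float]]:
    """Return (phi_0, [phi_1..phi_m]) for `model` at `x` by exact enumeration."""
    m = len(x)

    def value(subset: frozenset) -> float:
        # Features in `subset` take their value from x, the rest from the background.
        total = 0.0
        for b in background:
            z = [x[j] if j in subset else b[j] for j in range(m)]
            total += model(z)
        return total / len(background)

    phi0 = value(frozenset())  # expected output over the baseline dataset
    phis = []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        phi = 0.0
        for size in range(m):
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for s in combinations(others, size):
                s = frozenset(s)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phi0, phis


def toy_load(z: Sequence[float]) -> float:
    """Hypothetical load model; coefficients are illustrative, not fitted."""
    return 40.0 + 0.02 * z[0] - 0.01 * z[1] + 0.0001 * z[0] * z[2]


background = [[100.0, 200.0, 50.0], [300.0, 100.0, 150.0]]
x = [250.0, 300.0, 100.0]
phi0, phis = shapley_values(toy_load, x, background)
reconstructed = phi0 + sum(phis)  # should equal toy_load(x)
```

Exhaustive enumeration scales exponentially with the number of features, which is why tree-specific approximations are normally used for models with 26 inputs.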

By combining random forest modeling with XAI technology, this framework not only predicts building energy consumption but also explains prediction results, enabling architects to understand the specific impact of design decisions on energy efficiency and providing reliable tools for optimizing building performance during early design stages.

Table 2 Characteristics and abbreviations of the impact of building layout on building energy consumption performance.

| Characteristics | Office | Meetings | Cafeteria | Serviced apartment |
|---|---|---|---|---|
| Number of planar areas | OC | MC | CC | AC |
| North-facing area | ON | MN | CN | AN |
| East-facing area | OE | ME | CE | AE |
| South-facing area | OS | MS | CS | AS |
| West-facing area | OW | MW | CW | AW |
| Office proximity area | - | O-M | O-C | O-A |
| Adjacent area of the meeting | O-M | - | M-C | M-A |
| Area adjacent to the cafeteria | O-C | M-C | - | C-A |
| Hotel-style apartment adjacent area | O-A | M-A | C-A | - |

3.4. Spatial layout design variables and feature engineering

To clarify the mechanism by which spatial layout affects building energy consumption, this paper quantifies the effect through three aspects: the concentration and dispersion of spaces, the orientation of spaces, and the proximity between spaces. In each energy consumption simulation of the iterative optimization, a total of 26 layout features in three major categories were recorded to quantify how building layout affects the building heat load. The three categories are: the number of planar areas of each space, the facing area of each space, and the adjacent area between spaces. The specific features in each category and their abbreviations are shown in Table 2.

During spatial layout generation, each space type can be arranged in a concentrated or a dispersed manner, and the number of planar areas of each space is used to quantify this effect. The more dispersed the arrangement of a space type, the greater its number of planar areas; conversely, fewer planar areas indicate a more concentrated arrangement.

The space orientation is used to quantitatively assess the impact of the orientation of each space on the building’s heat load. Since the overall orientation of the building is due south, the orientation area of a space is determined by the sum of its facade areas facing the corresponding direction. In the case shown in Fig. 4, if we suppose this is a floor plan of a single-story building with a height of 3 m, then the north-facing area of space A is 3 m², the east- and south-facing areas are 0, and the west-facing area is 3 m².

The adjacent areas between spaces are used to quantitatively assess the adjacency between different spaces. The data for this feature are obtained by calculating the sum of coplanar areas between each pair of spaces. In the case shown in Fig. 4, supposing again a floor plan of a single-story building with a floor height of 3 m, the planar adjacent area between space A and space B is 6 m². The calculation of these features only takes into account horizontal proximity within the same floor, not vertical proximity between different floors.
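The three feature categories can be computed from a gridded floor plan as sketched below. The sketch assumes a rectangular single-floor plan whose exterior facades lie on the grid boundary, with row 0 facing north. The 2 x 2 plan with 1 m units is hypothetical: it was chosen so that the results match the worked numbers quoted for space A, and it does not reproduce the actual geometry of Fig. 4.

```python
from collections import defaultdict
from typing import Dict, List


def layout_features(plan: List[List[str]], cell_width: float, floor_height: float):
    """Return (facing areas per space, adjacent area per pair of spaces)."""
    rows, cols = len(plan), len(plan[0])
    face = cell_width * floor_height  # area of one vertical unit face
    facing = defaultdict(lambda: {"N": 0.0, "E": 0.0, "S": 0.0, "W": 0.0})
    adjacent = defaultdict(float)
    for r in range(rows):
        for c in range(cols):
            s = plan[r][c]
            sides = facing[s]  # also registers interior-only spaces
            if r == 0:
                sides["N"] += face
            if r == rows - 1:
                sides["S"] += face
            if c == 0:
                sides["W"] += face
            if c == cols - 1:
                sides["E"] += face
            # Shared faces with the southern and eastern neighbours (same floor only).
            for rr, cc in ((r + 1, c), (r, c + 1)):
                if rr < rows and cc < cols and plan[rr][cc] != s:
                    adjacent[frozenset((s, plan[rr][cc]))] += face
    return dict(facing), dict(adjacent)


def count_planar_areas(plan: List[List[str]]) -> Dict[str, int]:
    """Number of edge-connected regions per space (more regions = more dispersed)."""
    rows, cols = len(plan), len(plan[0])
    seen = set()
    counts: Dict[str, int] = defaultdict(int)
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen:
                continue
            label = plan[r][c]
            counts[label] += 1
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                cr, cc = stack.pop()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and (nr, nc) not in seen and plan[nr][nc] == label):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
    return dict(counts)


# Hypothetical plan: space A in the north-west corner, B split into two regions.
plan = [["A", "B"],
        ["B", "C"]]
facing, adjacent = layout_features(plan, cell_width=1.0, floor_height=3.0)
planar_areas = count_planar_areas(plan)
```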

3.5. Multi-objective optimization problem formulation

To evaluate the impact of spatial layout on the cooling and heating loads of buildings across different climatic conditions, this study implements a comprehensive optimization framework that systematically varies spatial arrangements while maintaining consistent building parameters and simulation conditions. The optimization targets the minimization of annual average cooling and heating loads per unit area while ensuring compliance with spatial programming requirements.

3.5.1. Case study building configuration

A typical office building model was established to represent an archetypical medium-rise office as the research object for building performance simulation [54]. The mixed-use office building has 8 floors with a height of 3 m per floor, covering a standard floor area of 1,536 m² and a total building area of 12,288 m². The standard floor plan maintains an aspect ratio of 3:2 and is oriented due south to ensure consistent solar exposure analysis across all climate zones. The building form, as shown in Figs. 5 and 6, features a rectangular configuration with the middle space designated for vertical traffic and ancillary systems.

The case study office building accommodates four primary space uses: office area, meeting area, cafeteria, and hotel-style apartment. The specific area allocations and operational parameters for each space are detailed in Table 3, and the general building setup is listed in Table 4.

3.5.2. Climate zones and weather data

To enable comprehensive cross-climate comparison of spatial layout effects on building energy performance, this study selects five representative cities based on China’s building thermal design zoning standards. The selected locations represent distinct climate characteristics: Harbin, Beijing, Shanghai, Shenzhen, and Kunming. These cities correspond respectively to various climate zones according to the Köppen climate classification, as detailed in Table 5. Typical meteorological year (TMY) weather files for each city provide standardized climatic input data for energy simulations. These files ensure consistent baseline conditions across all climate zones while capturing the essential thermal characteristics that influence building energy performance. The heating and cooling periods for each city are established based on local regulations and climatic conditions, with specific periods detailed in Table 5.

Fig. 4. Architectural layout factors case floor plan diagram.

Fig. 5. Standard floor plan of the mixed-use office building.

Fig. 6. Performance simulation model and zoning diagram of a typical building automatically generated: (a) Energy consumption simulation model; (b) Schematic diagram of space zoning. Legend: Apartment, Office, Meeting, Canteen.

Table 3 Energy consumption simulation parameters for different space uses of the mixed-use office building.

| Function name | Area (m²) | Floor area per capita (m²/person) | Electrical equipment power density (W/m²) | Lighting power density (W/m²) |
|---|---|---|---|---|
| Office | 4608 | 10 | 15 | 8 |
| Meetings | 1280 | 10 | 12 | 12 |
| Apartment | 3840 | 25 | 15 | 6 |
| Cafeteria | 1536 | 8 | 13 | 12 |
| Vertical transportation | 1024 | 8 | 5 | 6 |

3.5.3. Detailed building energy simulation parameters and settings

Energy consumption simulations utilize Honeybee (interfacing with EnergyPlus) integrated within the Rhino 7/Grasshopper environment. Climate-specific boundary conditions are implemented through TMY weather files, while envelope properties, internal loads, and HVAC parameters remain constant across climate zones to isolate spatial layout effects. Building envelope materials and thermal properties are standardized across all simulations to eliminate confounding variables (exterior wall U-value: 0.45 W/m²·K; roof U-value: 0.53 W/m²·K; window U-value: 2.70 W/m²·K), with complete specifications provided in Table 6. Multi-objective optimization employs the Octopus plugin with SPEA-2 and HypE algorithms, configured with a population size of 60 individuals, 10 % elite probability, 80 % crossover rate, 30 % mutation probability, and a maximum of 30 generations.

Table 4 Building energy performance simulation model setup.

| No. | Parameter | Values |
|---|---|---|
| 1 | Meteorological parameters | Harbin; Beijing; Shanghai; Kunming; Shenzhen |
| 2 | Holidays | Chinese holidays |
| 3 | Building floors | 8 floors above ground |
| 4 | Standard floor aspect ratio | 3:2 |
| 5 | Building orientation | Due south |
| 6 | Total building area | 12,288 m² |
| 7 | Building dimensions | 36 m long × 24 m wide × 28.8 m high |
| 8 | Floor height | 3.6 m |
| 9 | East-west window-to-wall area ratio | 0.2 |
| 10 | South-to-north window-to-wall area ratio | 0.4 |

Table 5 Selected cities in various climate zones of China and their cooling and heating periods.

| City | Building Thermal Design Zone | Köppen Climate Classification | Cooling Period | Heating Period |
|---|---|---|---|---|
| Harbin | Severely cold regions | Continental monsoon climate | June 1 - August 31 | October 20 - April 20 |
| Beijing | Cold regions | Warm temperate continental monsoon climate | June 1 - August 31 | November 15 - March 15 |
| Shanghai | Hot summers and cold winters | Temperate monsoon climate | June 1 - August 31 | November 15 - March 15 |
| Shenzhen | Hot summers and warm winters | Subtropical monsoon climate | June 1 - September 30 | November 15 - March 15 |
| Kunming | Mild region | Subtropical plateau monsoon climate | June 1 - August 31 | November 15 - March 15 |

Operational schedules for personnel activity, electrical equipment usage, lighting systems, and HVAC operations are established according to space requirements and local practices, as comprehensively detailed in Appendix 3 (personnel activity and electrical equipment), Appendix 4 (lighting schedules), and Appendix 5 (heating and cooling schedules). These schedules differentiate between weekday and holiday operations while accounting for zone-specific usage patterns. Temperature setpoints for heating and cooling systems vary by space and operational period, as specified in Appendix 5.

The evolutionary optimization employs the Octopus plugin, implementing SPEA-2 and HypE algorithms within the Grasshopper platform. Evolutionary algorithm parameters are configured as follows: population size of 60 individuals, elite probability of 10 %, crossover rate of 80 %, mutation probability and mutation rate of 30 % each, with a maximum iteration limit of 30 generations serving as the termination criterion. For each climate zone, an independent optimization process targets the minimization of annual average cooling and heating load per unit area while satisfying all spatial programming constraints.

4. Results and discussions

4.1. Optimization results of the layout of the building

Fig. 7 presents the convergence characteristics of the evolutionary optimization process across all five climate zones over 30 iterations. The vertical axis represents total annual cooling and heating loads (kWh/m²), while the horizontal axis shows the iteration number. Each subplot corresponds to one climate zone: (a) Shenzhen (subtropical), (b) Kunming (mild), (c) Shanghai (hot summer-cold winter), (d) Beijing (cold), and (e) Harbin (severely cold). The convergence curves demonstrate that optimization systematically reduced energy consumption within 30 iterations across all climates. Shenzhen exhibited the largest absolute reduction and the steepest reduction trajectory, decreasing from approximately 69 kWh/m² to 57 kWh/m², a 17.25 % reduction, while Kunming showed the lowest baseline loads but achieved the highest percentage improvement of 24.91 %. Heating-dominated climates (such as Beijing and Harbin) displayed more gradual convergence patterns than cooling-dominated regions (such as Shenzhen), reflecting the differential complexity of spatial layout optimization under varying thermal conditions. These convergence characteristics support the computational efficiency and robustness of the evolutionary optimization framework across diverse climate contexts.

Table 6 Envelope material parameters of the mixed-use office building.

| Construction | Materials | U value (W/m²·K) |
|---|---|---|
| Exterior wall | Pure gypsum board 10 mm + extruded polystyrene board 60 mm + pure gypsum board 8 mm + heavy mortar clay 240 mm | 0.45 |
| Roof | Bitumen mineral wool felt 25 mm + extruded polystyrene board 50 mm + bitumen mineral wool felt 30 mm | 0.53 |
| Interior walls | 20 mm cement mortar + 180 mm ceramsite concrete + 20 mm cement mortar | 3.57 |
| Window | 6 High transparency Low-E + 12 Air + 6 transparent, heat-insulating metal profiles | 2.70 |


Fig. 7. The iterative process of the five groups of experiments.

Fig. 8 compares the worst-case and optimal spatial layout solutions discovered during the evolutionary optimization process, using Beijing as a representative example with typical floor plans shown. The worst-case scenario generated maximum cooling and heating loads of 74.89 kWh/m², while the optimal layout achieved minimum energy consumption of 67.68 kWh/m², a 9.67 % reduction. Color coding distinguishes functional zones: office spaces (blue), meeting rooms (green), cafeteria (yellow), and apartments (red). Visual inspection reveals that the optimal solution demonstrates increased space centralization with meeting rooms consolidated into fewer planar areas, reduced office-apartment adjacency through strategic spatial separation, and rational cafeteria placement to maximize east-facing exposure for passive solar heating. In contrast, the worst-case layout exhibits dispersed meeting room configurations, excessive office-apartment adjacent areas exceeding 400 m² per floor, and suboptimal orientation distribution. These visual differences help substantiate the quantitative energy performance gaps and provide architects with concrete examples distinguishing energy-efficient from energy-inefficient spatial arrangements. In addition, Appendix 6 presents complete optimization results including worst-case and optimal spatial layouts for all other climate zones (Shenzhen, Kunming, Shanghai, and Harbin), which readers can refer to for climate-specific spatial configuration patterns.

Fig. 8. The worst-case scenarios and the optimal spatial layout during the iterative process for Beijing. (a) The maximal cooling and heating load layout of the Beijing group (Generation 8, Individual 2): total loads 50.7 kW·h/m², cooling load 24.0 kW·h/m², heating load 26.7 kW·h/m². (b) The minimum cooling and heating load layout of the Beijing group (Generation 29, Individual 8): total loads 45.8 kW·h/m², cooling load 24.0 kW·h/m², heating load 21.8 kW·h/m². Each panel shows Floors 1 to 8; legend: Vertical Transport, Apartment, Office, Meeting, Canteen.

4.2. Climate-specific impact of spatial layout on building energy performance

The spatial layout optimization of all cities converged after 30 iterations. After the solutions that did not meet the requirements were excluded, more than 1700 historical solutions remained in each group. Fig. 9 presents the average annual cooling load and heating load per unit area of each group as bar charts, and Fig. 10 shows box plots of the total annual cooling and heating load per unit area for all historical solutions of the five groups.

The optimization results are summarized in Table 7, which demonstrates significant climate-dependent variations in energy savings potential. Shenzhen (subtropical) achieved the highest absolute energy savings with loads ranging from 57.06 to 68.95 kWh/m², showing a maximum 17.25 % reduction primarily from cooling load optimization. Kunming (mild climate) exhibited the largest percentage reduction (24.91 %) but minimal absolute savings due to low baseline loads (4.01-5.33 kWh/m²).

In heating-dominated climates, spatial layout optimization showed greater impact on heating than on cooling loads: Shanghai (hot summer/cold winter) achieved a 13.60 % total reduction with 29.38 % heating load variation, Beijing (cold) reached a 9.67 % total reduction with heating load reductions (18.51 %) exceeding cooling load reductions (8.52 %, Table 7), and Harbin (severely cold) demonstrated a 10.92 % total reduction with heating loads showing a 13.31 % variation range.

Overall, the Kunming group had the largest percentage reduction in cooling and heating load among the five study cities, but the overall energy-saving effect was not significant because of its lower base cooling and heating load values. The cooling and heating loads in the Shenzhen group were most significantly affected by spatial layout, and the range of load fluctuations was also the largest. The cooling and heating loads in Shanghai, Beijing and Harbin decreased by about 10 %, and the difference in heating load accounted for a larger proportion. Among them, the reduction in heating load in Shanghai showed the most significant difference compared to the reduction in cooling load.

The influence of spatial layout on the cooling and heating loads of buildings varies with climatic conditions. In mild regions, such as Kunming, the percentage reduction of cooling and heating load is the greatest, but because the base load is low, the energy-saving effect is not obvious. In hot summer and warm winter regions like Shenzhen, spatial layout has the most significant impact on cooling and heating loads, so special attention should be paid to spatial layout in design. In regions with high heating demand, such as hot summer and cold winter (Shanghai), cold (Beijing), and severely cold (Harbin) regions, spatial layout design has a greater impact on heating load than on cooling load.
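The percentage reductions discussed above follow the definition given with Table 7, maximum decrease = (maximum − minimum)/maximum. The short check below recomputes the values for the combined cooling and heating load rows, using the rounded maxima and minima printed in Table 7; small deviations from the reported percentages are therefore expected and are attributed here to rounding of the table entries. The variable names are illustrative.

```python
# (maximum, minimum, reported maximum decrease in %) for the combined
# cooling and heating load rows of Table 7, in kWh/m2.
TABLE7_TOTAL = {
    "Shenzhen": (68.95, 57.06, 17.25),
    "Kunming": (5.33, 4.01, 24.91),
    "Shanghai": (42.64, 36.84, 13.60),
    "Beijing": (50.65, 45.76, 9.67),
    "Harbin": (93.62, 83.40, 10.92),
}


def max_decrease_percent(maximum: float, minimum: float) -> float:
    """Maximum decrease = (maximum - minimum) / maximum, expressed in percent."""
    return (maximum - minimum) / maximum * 100.0


recomputed = {city: max_decrease_percent(mx, mn)
              for city, (mx, mn, _) in TABLE7_TOTAL.items()}
absolute_saving = {city: mx - mn for city, (mx, mn, _) in TABLE7_TOTAL.items()}

for city, (mx, mn, reported) in TABLE7_TOTAL.items():
    print(f"{city:9s} recomputed {recomputed[city]:5.2f} %  reported {reported:5.2f} %  "
          f"absolute saving {absolute_saving[city]:5.2f} kWh/m2")

largest_percentage = max(recomputed, key=recomputed.get)
largest_absolute = max(absolute_saving, key=absolute_saving.get)
```

The recomputed figures reproduce the contrast drawn in the text: Kunming has the largest percentage reduction, whereas Shenzhen has the largest absolute saving.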

4.3. Machine learning model validation

To ensure rigorous model validation and prevent overfitting, we implemented multiple validation strategies including independent test set evaluation, cross-validation for hyperparameter tuning, and statistical residual analysis. The sample dataset used for the random forest regression model comprises building spatial layout features and cooling and heating load data from all historical optimization solutions. Dataset sizes range from 1707 to 1797 samples per climate zone, providing sufficient data for reliable model training and evaluation. Each dataset was randomly partitioned into training (80 %) and independent test sets (20 %) using stratified sampling to maintain representative distributions, as shown in Table 8. All reported performance metrics are calculated on the independent test sets that were completely withheld during model training, ensuring unbiased accuracy assessment. Additionally, 5-fold cross-validation on the training set was employed for hyperparameter optimization to further mitigate overfitting risks.

Fig. 9. Average annual cooling and heating loads per unit area of the historical solutions for the five cities.

Fig. 10. Annual cooling and heating loads for all historical solutions of the five cities.

Table 7 Simulation results of automatic optimization of cooling and heating loads for building spatial layout in five thermal zones.

| Experiment group | Number of historical solutions | Load type | Maximum (kW·h/m²) | Minimum (kW·h/m²) | Average (kW·h/m²) | Median (kW·h/m²) | Standard deviation | Maximum decrease (%) |
|---|---|---|---|---|---|---|---|---|
| Shenzhen | 1707 | Cooling load and heating load | 68.95 | 57.06 | 59.55 | 59.19 | 2.43 | 17.25 |
| | | Cooling load | 68.67 | 56.83 | 59.32 | 58.97 | 2.43 | 17.24 |
| | | Heating load | 0.38 | 0.15 | 0.23 | 0.23 | 0.03 | 61.71 |
| Kunming | 1768 | Cooling load and heating load | 5.33 | 4.01 | 4.20 | 4.15 | 0.19 | 24.91 |
| | | Cooling load | 2.99 | 2.48 | 2.71 | 2.71 | 0.06 | 17.20 |
| | | Heating load | 2.61 | 1.31 | 1.49 | 1.43 | 0.18 | 49.92 |
| Shanghai | 1796 | Cooling load and heating load | 42.64 | 36.84 | 37.65 | 37.47 | 0.75 | 13.60 |
| | | Cooling load | 30.57 | 27.92 | 28.40 | 28.36 | 0.38 | 8.66 |
| | | Heating load | 12.08 | 8.53 | 9.24 | 9.05 | 0.45 | 29.38 |
| Beijing | 1797 | Cooling load and heating load | 50.65 | 45.76 | 46.58 | 46.22 | 0.81 | 9.67 |
| | | Cooling load | 25.37 | 23.21 | 24.20 | 24.13 | 0.21 | 8.52 |
| | | Heating load | 26.65 | 21.72 | 22.38 | 22.10 | 0.71 | 18.51 |
| Harbin | 1770 | Cooling load and heating load | 93.62 | 83.40 | 85.05 | 84.57 | 1.53 | 10.92 |
| | | Cooling load | 11.07 | 9.90 | 10.74 | 10.76 | 0.13 | 10.64 |
| | | Heating load | 83.71 | 72.58 | 74.31 | 73.79 | 1.61 | 13.31 |

Note: Maximum decrease = (maximum − minimum)/maximum.

Table 8 Sample grouping results for the five cities.

| City | Training set sample size | Test set sample size | Number of features |
|---|---|---|---|
| Shenzhen | 1366 | 341 | 30 |
| Kunming | 1414 | 354 | 30 |
| Shanghai | 1437 | 360 | 30 |
| Beijing | 1438 | 359 | 30 |
| Harbin | 1416 | 354 | 30 |

For model training, the building layout features were used as the influencing factors (predictors), and the sum of cooling and heating loads was used as the dependent variable.

The 80 %-20 % train-test split ratio was selected based on established machine learning practice and has been validated as effective for the algorithms and dataset sizes involved in this research [55,56]. This ratio ensures sufficient training data for model learning while providing adequate independent test data for unbiased performance evaluation. Hyperparameter optimization for the random forest models was conducted using grid search with 5-fold cross-validation on the training set [57]. The optimized hyperparameters included: number of trees (n_estimators) tested over [100, 200, 300, 500], maximum depth (max_depth) tested over [10, 20, 30, None], minimum samples to split (min_samples_split) tested over [2, 5, 10], and minimum samples per leaf (min_samples_leaf) tested over [1, 2, 4]. The hyperparameter combination that minimized cross-validated RMSE was selected for each climate zone’s final model. This optimization process helps ensure that the models achieve strong predictive performance while limiting overfitting [58].
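To give a sense of the search cost implied by this grid, the sketch below enumerates the hyperparameter combinations listed above and counts the model fits required under 5-fold cross-validation. Only the grid values and the fold count come from the text; the counting script itself is illustrative, and it assumes one fit per combination and fold, excluding any final refit on the full training set.

```python
from itertools import product

# Search grid as listed in the text (names follow the scikit-learn convention).
param_grid = {
    "n_estimators": [100, 200, 300, 500],
    "max_depth": [10, 20, 30, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
}
N_FOLDS = 5          # 5-fold cross-validation on the training set
N_CLIMATE_ZONES = 5  # one model per climate zone

names = list(param_grid)
candidates = [dict(zip(names, values)) for values in product(*param_grid.values())]

fits_per_zone = len(candidates) * N_FOLDS
total_fits = fits_per_zone * N_CLIMATE_ZONES

print(len(candidates), fits_per_zone, total_fits)  # 144 720 3600
```

A dictionary of this shape is what a grid-search routine typically accepts as its parameter grid, with the candidate minimizing cross-validated RMSE retained for each zone.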

To verify whether the data structure of the grouped samples is consistent with the original total sample, this section statistically analyzes the distribution of the cooling and heating load data in the total sample, the training set, and the test set of each of the five cities. These statistics are presented in the form of a bar chart in Table 9. The chart shows that the parameter distributions of the total sample, training set and test set of the five cities are roughly the same, and the numerical ranges are consistent. This validates the reasonableness of the sample division and ensures the feasibility of training the random forest model.

After the partitioning of the sample data was completed, the training sets for each city were used to train the random forest model. As shown in Fig. 11, the predicted values fit well in the data-intensive sections, but in the ranges with higher heating and cooling load values there is a relatively larger deviation due to the sparse distribution of the data. Overall, the prediction results of the five random forest models are relatively good.

To further verify the accuracy of the five groups of random forest regression models, the trained random forest models were tested using the corresponding test set samples, with RMSE, NRMSE, R², and MAPE as the performance evaluation indicators of the models. The evaluation criteria for the five random forest regression models are shown in Table 10.
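The four indicators can be computed as in the sketch below. The exact NRMSE normalization used in the study is not stated in this section; normalizing RMSE by the observed range of the true values is an assumption here, although it is consistent with the reported Kunming figures (0.072 / (5.33 − 4.01) ≈ 5.5 %). The sample values are hypothetical.

```python
from math import sqrt
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """RMSE, range-normalized NRMSE (%), R2, and MAPE (%) on a test set."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    rmse = sqrt(ss_res / n)
    nrmse = rmse / (max(y_true) - min(y_true)) * 100.0  # assumed normalization
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    mape = sum(abs(e / t) for e, t in zip(errors, y_true)) / n * 100.0
    return {"RMSE": rmse, "NRMSE": nrmse, "R2": r2, "MAPE": mape}


# Hypothetical total loads (kWh/m2) and predictions for four test samples.
y_true = [4.0, 4.2, 4.4, 5.0]
y_pred = [4.1, 4.1, 4.5, 4.9]
metrics = regression_metrics(y_true, y_pred)
```

Because MAPE divides by the true load, zones with high baseline loads tend to show a low MAPE for a given absolute error, whereas RMSE scales with the magnitude of the loads.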

To assess the statistical significance of model performance, we conducted residual analysis including normality tests (Shapiro-Wilk test, p > 0.05 for all climate zones) and homoscedasticity tests, confirming that model assumptions were satisfied [59]. The performance metrics demonstrate strong predictive capability when benchmarked against established criteria in building energy prediction literature. Previous studies have established that NRMSE values below 10 % indicate good model performance for building energy prediction, with values below 5 % considered excellent [60,61]. Our models achieved NRMSE values ranging from 4.92 % to 7.95 %, with three of five climate zones (Kunming: 5.47 %, Shanghai: 5.52 %, Harbin: 4.92 %) at or near the excellent threshold and the remaining two (Shenzhen: 7.77 %, Beijing: 7.96 %) showing good performance. The R² values ranging from 0.801 to 0.874 indicate strong model fit, exceeding the benchmark of R² > 0.75 commonly used for acceptable building energy prediction models [62,63]. Specifically, the Shenzhen (R² = 0.873) and Harbin (R² = 0.874) models achieved particularly high explanatory power, while Beijing (R² = 0.801) showed the lowest but still acceptable performance.

The MAPE values (0.236 %–0.792 %) are notably low compared to typical building energy prediction studies [64,65]. Our results outperform these benchmarks, with all climate zones achieving MAPE well below 1 %. Statistical comparison between climate zones using analysis of variance (ANOVA) revealed no significant differences in model performance metrics (F = 2.31, p = 0.089 for the R² comparison), indicating consistent model quality across all climate zones. The smallest RMSE value appeared in the Kunming group (0.072 kWh/m²), which is expected given the lower baseline energy consumption in this mild climate zone. Similarly, Harbin achieved the lowest MAPE (0.236 %), reflecting the model's high accuracy relative to that region's higher energy consumption values. Overall, the five groups of random forest regression models demonstrated statistically validated and literature-benchmarked strong performance in predicting building cooling and heating loads from building spatial layout features, with consistent accuracy across different climate zones.
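For readers who want to reproduce the four indicators, the following is a minimal sketch of how RMSE, NRMSE, R², and MAPE can be computed from paired true and predicted loads. The function name `regression_metrics` and the sample values are illustrative only. The normalisation of NRMSE by the observed range of the true values is an assumption, since the text does not state which normaliser was used.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return RMSE, NRMSE (%), R2 and MAPE (%) for paired observations."""
    n = len(y_true)
    errors = [pred - true for true, pred in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    rmse = sqrt(ss_res / n)
    return {
        "RMSE": rmse,
        # Assumption: NRMSE is RMSE divided by the range of the true values.
        "NRMSE_pct": 100.0 * rmse / (max(y_true) - min(y_true)),
        "R2": 1.0 - ss_res / ss_tot,
        "MAPE_pct": 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Hypothetical loads in kWh/m2, not data from the study.
loads_true = [58.0, 60.0, 62.0, 64.0]
loads_pred = [58.5, 59.5, 62.5, 63.5]
metrics = regression_metrics(loads_true, loads_pred)
print(metrics)
```

Because MAPE divides each error by the true load, the same absolute error yields a smaller MAPE where loads are high, which is in line with the remark above about Harbin.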

Moreover, we selected Random Forest (RF) over alternative machine learning methods, including Artificial Neural Networks (ANN), for several methodologically justified reasons that align with our research objectives and dataset characteristics. First, RF demonstrates superior resistance to overfitting for datasets of our size (approximately 1400–1800 samples per climate zone), whereas ANN typically requires substantially larger training datasets (tens of thousands of samples) to achieve stable generalization performance and avoid overfitting [66,67]. The limited sample size per climate zone, while sufficient for RF, would pose significant challenges for training deep neural networks without extensive regularization and data augmentation strategies. Second, RF requires minimal hyperparameter tuning with robust default parameters, while ANN demands extensive architecture design decisions and computationally expensive training procedures involving learning rate scheduling, batch size optimization, dropout rate selection, and activation function choices [68]. This computational efficiency was essential for training separate models across five climate zones with iterative hyperparameter optimization. Third, RF can naturally handle the mixed feature types in our dataset (i.e., continuous spatial measurements such as facing areas and discrete counts such as the number of planar areas) without extensive preprocessing, whereas ANN requires careful feature normalization, scaling, and encoding that can introduce additional sources of error. Last and most important for our research objectives, RF provides direct feature importance measures that integrate seamlessly with SHAP analysis for interpretability, while ANN's deep architectures present significant challenges for explainability even with advanced techniques [69]. Given that our research focus extends beyond mere prediction accuracy to understanding the mechanistic relationships between spatial layout characteristics and energy performance through explainable AI, the interpretability advantage of RF is essential. The combination of competitive prediction accuracy (R² > 0.80), computational efficiency, robustness to our dataset size, and superior interpretability makes RF the optimal choice for achieving this study's dual objectives of accurate prediction and actionable design insights.

Table 9 Distribution of heating and cooling load samples of random forest regression models in five cities.

[Table 9 consists of histogram panels for Shenzhen, Kunming, Shanghai, Beijing, and Harbin, each shown for the total sample, the training set, and the test set. Horizontal axis: cooling and heating loads; vertical axis: number of samples.]

Fig. 11. Prediction results of the five groups of random forest models: (a) Shenzhen, (b) Kunming, (c) Shanghai, (d) Beijing, (e) Harbin.

Table 10
Performance evaluation metrics for five groups of random forest regression models.

City       RMSE (kWh/m²)   NRMSE (%)   R²      MAPE (%)
Shenzhen   0.924           7.770       0.873   0.792
Kunming    0.072           5.471       0.858   0.751
Shanghai   0.320           5.520       0.822   0.343
Beijing    0.389           7.955       0.801   0.323
Harbin     0.503           4.918       0.874   0.236

Fig. 12. Global analysis of SHAP values for each group: (a) Shenzhen, (b) Kunming, (c) Shanghai, (d) Beijing, (e) Harbin.

Fig. 13. Relationship graph of SHAP values for east-facing area of Shenzhen office space.

4.4. Interpretability analysis using explainable AI

SHAP values are used to interpret the five groups of random forest regression models, and both global and local analyses are conducted on the model for each of the five cities. Global interpretation, which describes the expected behavior of a machine learning model over the entire distribution of its input variables, is achieved in SHAP by aggregating the SHAP values of all sample instances. It reveals the relative importance of each feature as well as its actual relationship to the predicted results. Local interpretation, on the other hand, analyzes predictions for specific instances, explaining how an individual prediction is obtained from the contribution of each model input variable. This helps us examine, through local examples, the extent to which individual features affect the prediction results. With this methodological framework, we explore the importance of building spatial layout features in predicting building cooling and heating loads under different climatic conditions, as well as the differences in their effects and mechanisms of influence.
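The distinction between local and global interpretation can be illustrated with a small, self-contained sketch that computes exact Shapley values by enumerating feature subsets. This is not the study's pipeline, which applies SHAP to trained random forest models. The predictor `toy_load`, its coefficients, and the four sample layouts are hypothetical and chosen only for illustration; the feature names OE, AE, and OW follow the abbreviations used in this section.

```python
from itertools import combinations
from math import factorial
from statistics import mean

FEATURES = ["OE", "AE", "OW"]  # east-facing office, east-facing apartment, west-facing office


def toy_load(x):
    """Hypothetical load predictor (not the trained random forest)."""
    oe, ae, ow = x
    return 55.0 + 0.02 * oe + 0.005 * ae + 0.01 * ow + 0.00002 * oe * ow


def shapley_values(predict, x, background):
    """Exact Shapley values of `predict` at `x` relative to `background` rows."""
    n = len(x)

    def value(subset):
        # Mean prediction when features in `subset` take x's values
        # and the remaining features take each background row's values.
        return mean(
            predict([x[j] if j in subset else row[j] for j in range(n)])
            for row in background
        )

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Hypothetical feature values in m2 for four layouts.
layouts = [
    [100.0, 300.0, 200.0],
    [432.0, 150.0, 100.0],
    [250.0, 250.0, 250.0],
    [50.0, 400.0, 300.0],
]

# Local interpretation: contributions for one specific layout.
local = dict(zip(FEATURES, shapley_values(toy_load, layouts[1], layouts)))

# Global interpretation: mean absolute contribution over all layouts.
all_phi = [shapley_values(toy_load, x, layouts) for x in layouts]
global_importance = {
    name: mean(abs(phi[i]) for phi in all_phi) for i, name in enumerate(FEATURES)
}
print(local)
print(global_importance)
```

The local contributions sum to the difference between the prediction for that layout and the mean prediction over the background set, which is the property that lets per-layout SHAP maps be read as a decomposition of the predicted load.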

For Shenzhen, the east-facing area of office space (OE), the east-facing area of apartment space (AE), and the west-facing area of office space (OW) had the most significant effects on cooling and heating loads, with the highest mean absolute SHAP values (Fig. 12(a)). The scatter plot in Fig. 13 further reveals the nonlinear relationship between the feature values and the predicted cooling and heating loads. A larger east-facing office area (OE) corresponds to positive SHAP values, indicating that arranging more east-facing office space significantly increases the building's cooling and heating load. This may be due to increased energy consumption caused by east-facing daylighting and morning heat gain. A smaller east-facing office area reduces the cooling and heating load, but the impact is relatively small, so the effect is asymmetric. Fig. 14 confirms this trend through a local case study, read together with Fig. 13. In the scheme with the highest load, the 432 m² east-facing office area leads to a significant increase in energy consumption. This suggests that optimizing the orientation and area distribution of spaces, such as placing more cafeteria and apartment spaces on the east side, is an effective strategy for reducing energy consumption.

In subtropical climates, building orientation has a particularly crucial impact on energy consumption. An east orientation leads to morning heat gain and improved daylighting, thereby increasing the building's cooling and heating loads. When optimizing the spatial layout, a rational distribution of these orientations, especially placing more dining and apartment spaces on the east side, helps to reduce energy consumption. In subtropical climates, radiative heat gain and heat conduction through the building's outer surface are the main influencing factors. An increase in the east-facing area increases the heat load in the morning, while optimizing the orientation distribution reduces energy consumption by lowering the solar radiation heat load.

Fig. 14. Map of the SHAP values of each floor plan of the scheme with the highest predicted cooling and heating load values in Shenzhen.

Fig. 15. Global analysis of SHAP values for Shanghai: (a) SHAP value analysis of the planar adjacent area between apartment and office spaces; (b) SHAP value relationship for the number of floor areas of the cafeteria space; (c) SHAP values of the space number; (d) SHAP value relationship for the planar adjacent area between apartment and conference spaces.

Fig. 16. Map of each floor and SHAP value map of the scheme with the lowest predicted cooling and heating load values for Shanghai.


Fig. 17. Global analysis of SHAP values for Beijing: (a) SHAP value analysis of the planar adjacent area between apartment and office spaces; (b) relationship between SHAP values and the east-facing area of the cafeteria space.

The results for Kunming in Fig. 12(b) show that a larger east-facing area of office space (OE) can significantly reduce the cooling and heating load, so the east-facing arrangement of office spaces plays an important energy-saving role in this mild climate. The effects of the other features were smaller, reflecting the relatively stable energy consumption characteristics of the Kunming area. The climate in Kunming is relatively mild, with moderate temperatures in summer and colder winters. Increasing the east-facing area can enhance the collection of sunlight in winter and improve daylighting and natural heating, thereby reducing the demand for heating. Because the overall energy consumption in Kunming is relatively low, the absolute impact of energy-saving design is relatively weak.

In the analysis of the Shanghai group, the planar adjacent area between office and apartment spaces (O-A) had the most significant impact on cooling and heating loads (Fig. 12(c)). Larger adjacent areas increase energy consumption, while smaller adjacent areas help reduce the load. When the number of cafeteria spaces is small, cooling and heating loads can be effectively reduced, while a dispersed layout significantly increases energy consumption. The local analysis results visualized in Figs. 15 and 16 further confirm the global trends: in the scheme with the lowest cooling and heating load, the cafeteria space is divided into only two areas (CC = 2), the apartment and meeting spaces are closely arranged (A-M = 576), and the office and apartment spaces share a smaller adjacent area (O-A = 259.2), all contributing to the energy-saving effect. Shanghai has a hot summer and cold winter climate. High temperatures in summer increase the air conditioning load, while cold winters increase the heating load. Reducing the adjacent area between office and apartment spaces limits heat exchange between them, thereby reducing the cooling and heating load. At the same time, a centralized cafeteria space helps to optimize the spatial layout and reduce unnecessary energy consumption.
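Features such as the number of planar areas of a space type (for example CC) and the adjacency between two space types (for example O-A) can be derived from a gridded floor plan. The sketch below shows one possible way to do this, using a flood fill to count connected regions and an edge count as a proxy for adjacent area. The single-letter grid encoding, the function names, and the use of shared cell edges rather than areas in m² are assumptions for illustration; the feature definitions used in the study are those given in Table 2.

```python
from collections import deque

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def count_regions(grid, zone):
    """Count 4-connected regions of `zone` cells on one floor (flood fill)."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    regions = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != zone or (r, c) in seen:
                continue
            regions += 1
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in NEIGHBORS:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] == zone and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
    return regions


def shared_edges(grid, zone_a, zone_b):
    """Count cell edges shared between `zone_a` and `zone_b` cells."""
    rows, cols = len(grid), len(grid[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != zone_a:
                continue
            for dr, dc in NEIGHBORS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == zone_b:
                    count += 1
    return count


# Hypothetical floor: O = office, A = apartment, M = meeting, C = cafeteria.
floor = [
    "OOAC",
    "OAAC",
    "OOMM",
    "CCMM",
]
cafeteria_regions = count_regions(floor, "C")
office_apartment_edges = shared_edges(floor, "O", "A")
print(cafeteria_regions, office_apartment_edges)
```

Multiplying the shared edge count by the grid unit length and the storey height would give an adjacent area, but those dimensions are not assumed here.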

For Beijing, the planar adjacent area between office and apartment spaces (O-A) had the greatest impact on the cooling and heating load (Fig. 12(d)). In particular, when O-A is greater than 100 m² (Fig. 17), energy consumption increases significantly. Meanwhile, the east-facing area of the cafeteria space (CE) has a significant negative effect on the predicted load, and a larger east-facing area can effectively reduce the cooling and heating load. The cold climate in Beijing means a greater demand for heating in winter, and increasing the east-facing area of the cafeteria can effectively utilize sunlight to reduce the heating load. At the same time, reducing the adjacent area between office and apartment spaces can reduce indoor heat exchange, optimize the thermal environment of the building, and lower energy consumption.

For Harbin, the number of planar regions (MC) of the conference space had the most significant effect on cooling and heating loads (Fig. 12(e)). A smaller number of meeting areas helped to reduce the cooling and heating load. When the floor area adjacent to the office and apartment spaces (O-M) is larger, the load forecast decreases; otherwise, it increases. The extremely cold climate in Harbin means extremely high heating load in winter. Under such climatic conditions, the centralized arrangement of conference spaces can reduce the building’s voids and heat loss, thereby reducing the load. Reasonable adjacent arrangement of office and apartment spaces helps to reduce energy consumption and avoid unnecessary heat exchange and temperature fluctuations.

By combining the SHAP analysis results of five cities, the following key strategies for energy conservation in building spatial layouts can be identified: (1) Climate-adaptive design: The orientation of buildings, the spatial layouts, and the configuration of adjacent areas should vary under different climatic conditions. In subtropical climates (Shenzhen), eastward orientation and reduced westward lighting are key to energy conservation, while in cold climates (such as Harbin), the centralized arrangement of conference spaces and reasonable spatial layout can significantly reduce energy load. (2) Space layout distribution and layout optimization: In hot summer and cold winter or temperate climate, the rational allocation of the orientation and adjacent area of spaces, especially the layout of canteens, offices and meeting spaces, will play a significant role in energy conservation. Avoiding excessive adjacent areas and scattered layouts is an important means to reduce the building’s cooling and heating load. (3) Heat load regulation and control: By optimizing the heat load on the exterior surface of the building and rationally designing the orientation, window surfaces and spatial distribution of the building, the heat exchange between the interior and exterior of the building can be effectively regulated, the energy efficiency of the building can be improved, and the heat load can be reduced.

4.5. Practical design guidelines for climate-adaptive spatial layout

The findings from optimization and SHAP analysis can be translated into actionable design strategies for practitioners working in different climate zones. This section provides specific guidelines for applying the identified principles during early-stage architectural design:

For subtropical climates (represented by Shenzhen): The dominant factor affecting energy performance is orientation area distribution, particularly east-facing exposure. Architects should minimize office spaces on east-facing facades where morning solar heat gain significantly increases cooling loads. Instead, allocate dining halls, circulation spaces, or service areas to east orientations, as these have lower sensitivity to solar heat gain. West-facing areas should also be minimized for high-occupancy spaces. For a typical rectangular building, this translates to: (1) positioning primary office zones on north and south facades where window-to-wall ratios can be optimized for daylighting without excessive heat gain, (2) locating cafeterias and meeting rooms on east and west sides where intermittent occupancy patterns better tolerate thermal fluctuations, and (3) using serviced apartments (with lower daytime occupancy) as thermal buffers on problematic orientations.

For heating-dominated climates (Beijing, Harbin, Shanghai): Space centralization and adjacency optimization become critical. Architects should consolidate spaces with similar thermal requirements to reduce heat loss through internal partitions. Practical strategies include: (1) concentrating meeting rooms in fewer, larger zones rather than dispersing them across multiple floors (reducing the number of planar areas), (2) minimizing adjacent areas between office and apartment spaces by introducing buffer zones or separating these functions vertically, and (3) locating cafeterias to maximize east-facing exposure for passive solar heating during winter while maintaining centralized configurations. For Shanghai specifically, reducing office-apartment adjacency from > 400 m² to < 260 m² per floor can achieve substantial energy savings.

For mild climates (Kunming): Although absolute energy savings are modest, orientation optimization remains beneficial. East-facing office areas should be maximized to capture morning sunlight for winter heating while avoiding overheating during mild weather. The relatively low energy intensity in mild climates provides design flexibility, allowing greater emphasis on other performance criteria such as daylighting and spatial quality.

Sensitivity analysis across climate zones reveals that the relative importance of spatial variables shifts with climate intensity. In extreme climates (severely cold or hot-humid), spatial layout decisions have an amplified impact: errors in space allocation result in proportionally greater energy penalties. Conversely, mild climates exhibit greater tolerance to layout variations. This climate-dependent sensitivity suggests that investment in computational optimization yields the highest returns in extreme climate zones, where design precision matters most. During schematic design, architects can apply these guidelines by: (1) conducting preliminary zoning studies that test orientation distribution and space concentration patterns based on climate zone, (2) using the 26 quantified spatial features listed in Table 2 as evaluation metrics to assess alternative layouts, (3) prioritizing the top-ranked SHAP features identified for their specific climate zone during iterative refinement, and (4) validating final schemes through simplified energy simulation focusing on cooling and heating loads. The automated generation method developed in this research can be adapted as a design exploration tool, with architects adjusting the priority weights of different spatial features based on the climate-specific importance rankings revealed by SHAP analysis.

5. Conclusion

This study investigated the impact of spatial layout optimization on energy consumption in medium-rise office buildings across five Chinese cities with diverse climatic conditions. Using building spatial layout generation tools, evolutionary algorithms, and machine learning methods including random forest and SHAP analysis, we examined how different building layouts affect cooling and heating loads in various thermal zones. The research demonstrates that spatial layout plays a crucial role in energy-efficient building design, with the proposed optimization tool achieving approximately 10 % energy savings in cooling and heating loads. The most significant results were observed in Shenzhen's subtropical climate, where energy savings reached 17.25 %, highlighting the substantial potential for climate-adaptive design strategies.

Climate-specific patterns emerged from this study. In mild climates like Kunming, while percentage reductions in loads were notable, overall energy savings remained modest due to lower baseline consumption. Subtropical regions (Shenzhen) showed the most dramatic load fluctuations, indicating that spatial optimization is particularly critical in such climates. In heating-dominated regions including Shanghai, Beijing, and Harbin, spatial layout optimization primarily influenced heating loads, with substantial reductions achieved through strategic planning. Through quantitative analysis of layout concentration, dispersion, orientation, and spatial proximity, several climate-specific strategies were identified.

• Subtropical climates: East-facing orientations and reduced west-facing windows are essential for energy conservation
• Cold climates: Concentrating similar spaces and implementing rational spatial arrangements significantly reduce energy loads
• Temperate and hot summer/cold winter climates: Strategic space orientation and adjacent area allocation provide substantial energy benefits, while avoiding excessive adjacencies and dispersed layouts reduces cooling and heating demands

This automation-based spatial optimization methodology provides a widely adaptable framework for energy-efficient design of medium-rise office buildings across different climatic conditions. The quantitative analysis mechanisms offer targeted, climate-specific strategies that advance computational green building design practices. However, this study's scope is limited to one building type and five Chinese climate zones, potentially restricting global applicability. The geometric constraints of the spatial generation algorithm may not capture all design variations, and the focus on cooling and heating loads excludes other energy considerations like lighting and equipment optimization. Future research should expand to include diverse building types, international climate regions, comprehensive energy requirements, and integration with renewable energy systems to develop more holistic energy-efficient design solutions.

CRediT authorship contribution statement

Peiying Huang: Writing – original draft, Visualization, Validation, Software, Methodology, Investigation, Conceptualization. Yanxiang Yang: Writing – original draft, Validation, Investigation, Data curation. Wen Gao: Writing – review & editing, Supervision, Resources, Methodology, Formal analysis. Xing Zheng: Writing – review & editing, Supervision, Resources, Project administration, Methodology. Pengyuan Shen: Writing – review & editing, Writing – original draft, Supervision, Resources, Project administration, Methodology, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

This research is supported by Shenzhen Fundamental Research Program JCYJ20250604180231041.

Appendix

Appendix 1 3D spatial layout generation algorithm pseudocode

Algorithm 1. 3D Space Layout Generation Algorithm

Input: program, gene_data, info_data
Output: result_program
// Generate the layout for vertical circulation.
vt_plan = creat_transport_plan(program, gene_data, info_data)
// Execute vertical growth of the vertical circulation.
program = creat_vertical_mass(program, vt_plan)
// Perform synchronous growth for multiple connected regions.
result_program = grow_program(program, gene_data, info_data)
// If any spatial units remain unfilled, continue growth until all units are filled.
if len(program.get_attribute_unit_seq(0)) > 0 do
    result_program = fill_program(result_program, gene_data, info_data)
end

Appendix 2 Iterative growth algorithm pseudocode

Algorithm 2. grow_program

Input: program, gene_data, info_data
Output: program
// Set the starting point according to the gene_data.
program = set_start_point(program, gene_data.start_unit)
// Grow the zones according to the gene_data until all zones have completed growth.
while not all(finish_check) do
    stop_check = [True] * info_data.zone_count
    // Grow according to the direction sequence in the gene_data.
    for dir in gene_data.direction do
        for zone in info_data.zone do
            // Grow according to the step length in the gene_data.
            for step in gene_data.Step_len[zone][dir] do
                // Iterate through the spatial units in this zone.
                for unit in program[zone] do
                    // If the neighboring spatial unit in the growth direction is vacant, grow into that unit as part of the zone.
                    grow_unit = get_neighbor_unit(program, unit, dir)
                    if grow_unit.attri == 0 do
                        program[grow_unit].attri = zone
                        stop_check[zone] = False
            // Check if the zone meets the required area; if so, mark the zone as completed.
            if info_data.area_demand[zone] * (1 - info_data.area_tolerance) ≤ cal_area(zone) ≤ info_data.area_demand[zone] * (1 + info_data.area_tolerance) do
                finish_check[zone] = True
    // If a zone cannot continue growing and has not met the area requirement, select a vacant unit as the new starting point.
    for zone in info_data.zone do
        if finish_check[zone] == False and stop_check[zone] == True do
            new_start_unit = get_vacant_unit(program)
            program[new_start_unit].attri = zone
end
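The growth logic of Algorithm 2 can be sketched as runnable code on a single rectangular floor. This is a simplified illustration under several assumptions, not the authors' implementation: the grid is two-dimensional, zone identifiers are non-zero integers with 0 marking a vacant unit, one step length is shared by all zones and directions, and a zone stops growing as soon as it reaches the lower bound of its area demand. The function name `grow_zones` and all parameter names are hypothetical.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]
OFFSETS: Dict[str, Cell] = {"N": (-1, 0), "S": (1, 0), "W": (0, -1), "E": (0, 1)}


def grow_zones(rows: int, cols: int, starts: Dict[int, Cell],
               demand: Dict[int, int], directions: List[str],
               step_len: int = 1, tolerance: float = 0.0) -> List[List[int]]:
    """Grow zones from start cells until each meets its area demand."""
    grid = [[0] * cols for _ in range(rows)]  # 0 marks a vacant unit
    for zone, (r, c) in starts.items():
        grid[r][c] = zone

    def cells_of(zone: int) -> List[Cell]:
        return [(r, c) for r in range(rows) for c in range(cols)
                if grid[r][c] == zone]

    def finished(zone: int) -> bool:
        return len(cells_of(zone)) >= demand[zone] * (1.0 - tolerance)

    while not all(finished(z) for z in demand):
        stalled = {z: True for z in demand}
        for d in directions:
            dr, dc = OFFSETS[d]
            for zone in demand:
                for _ in range(step_len):
                    # Snapshot the zone so new cells grow only on the next step.
                    for r, c in cells_of(zone):
                        if finished(zone):
                            break
                        nr, nc = r + dr, c + dc
                        if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                            grid[nr][nc] = zone
                            stalled[zone] = False
        # Reseed zones that could not grow and are still below their demand.
        for zone in demand:
            if not finished(zone) and stalled[zone]:
                vacant = cells_of(0)
                if not vacant:
                    return grid  # nothing left to allocate
                r, c = vacant[0]
                grid[r][c] = zone
    return grid


layout = grow_zones(4, 4, starts={1: (0, 0), 2: (3, 3)},
                    demand={1: 8, 2: 8}, directions=["E", "S", "W", "N"])
for row in layout:
    print(row)
```

In the pseudocode above, the direction sequence, step lengths, and start units come from `gene_data`, which is what allows an evolutionary algorithm to vary the resulting layout. In this sketch they are plain function arguments.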

Appendix 3 Schedule of personnel activity and electrical equipment usage for each spatial plan in the cooling and heating load optimization of typical medium-rise office building under different building thermal zones

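Hours 1-12

Area                 Day type          1   2   3   4   5   6   7   8   9  10  11  12
Office               Weekdays          0   0   0   0   0   0  10  50  95  95  95  80
Office               Holidays          0   0   0   0   0   0   0   0   0   0   0   0
Meetings             Weekdays          0   0   0   0   0   0   0  50  50  50  50   0
Meetings             Holidays          0   0   0   0   0   0   0   0   0   0   0   0
Cafeteria            Weekdays          0   0   0   0   0   0   0  20  20  20  20  90
Cafeteria            Holidays          0   0   0   0   0   0   0   0   0   0   0   0
Serviced apartment   All year round   70  70  70  70  70  70  70  70  50  50  50  50
Transportation       All year round    0   0   0   0   0   0  10  70  70  70  70  70

Hours 13-24

Area                 Day type         13  14  15  16  17  18  19  20  21  22  23  24
Office               Weekdays         80  95  95  95  95  30  30   0   0   0   0   0
Office               Holidays          0   0   0   0   0   0   0   0   0   0   0   0
Meetings             Weekdays          0  50  50  50  50  50  50   0   0   0   0   0
Meetings             Holidays          0   0   0   0   0   0   0   0   0   0   0   0
Cafeteria            Weekdays         90  90  20  20  20  20  20   0   0   0   0   0
Cafeteria            Holidays          0   0   0   0   0   0   0   0   0   0   0   0
Serviced apartment   All year round   50  50  50  50  50  50  70  70  70  70  70  70
Transportation       All year round   70  70  70  70  70  70  70  70  50   0   0   0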

Appendix 4 Lighting schedule for each spatial plan in the cooling and heating load optimization of typical medium-rise office building under different building thermal zones

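Hours 1-11

Area                 Day type          1   2   3   4   5   6   7   8   9  10  11
Office               Weekdays          0   0   0   0   0   0  10  50  95  95  80
Office               Holidays          0   0   0   0   0   0   0   0   0   0   0
Meetings             Weekdays          0   0   0   0   0   0   0  50  50  50   0
Meetings             Holidays          0   0   0   0   0   0   0   0   0   0   0
Cafeteria            Weekdays          0   0   0   0   0   0   0  20  20  20  90
Cafeteria            Holidays          0   0   0   0   0   0   0   0   0   0   0
Serviced apartment   All year round   10  10  10  10  10  10  30  30  30  30  30
Transportation       All year round   10  10  10  10  10  10  30  30  30  30  30

Hours 13-23

Area                 Day type         13  14  15  16  17  18  19  20  21  22  23
Office               Weekdays         80  95  95  95  95  30  30   0   0   0   0
Office               Holidays          0   0   0   0   0   0   0   0   0   0   0
Meetings             Weekdays          0  50  50  50  50  50  50   0   0   0   0
Meetings             Holidays          0   0   0   0   0   0   0   0   0   0   0
Cafeteria            Weekdays         90  90  20  20  20  20  20   0   0   0   0
Cafeteria            Holidays          0   0   0   0   0   0   0   0   0   0   0
Serviced apartment   All year round   30  30  50  50  60  90  90  90  90  80  10
Transportation       All year round   30  30  30  30  50  50  50  50  50  10  10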

Appendix 5 Schedule of heating and cooling for each spatial plan in the cooling and heating load optimization of typical medium-rise office building under different building thermal

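Time: hours 1 to 12

| Area | Period | Mode | Values in hourly order |
|---|---|---|---|
| Office | Weekdays | Air conditioner | -, -, -, -, -, 28, 26, 26, 26, 26, 26 |
| Office | Weekdays | Heating | 5, 5, 5, 5, 12, 18, 20, 20, 20, 20, 20 |
| Office | Holidays | Air conditioner | -, -, -, -, -, -, -, -, -, -, - |
| Office | Holidays | Heating | 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5 |
| Cafeteria | Weekdays | Air conditioner | -, -, -, -, -, -, -, -, -, 26, 26 |
| Cafeteria | Weekdays | Heating | 5, 5, 5, 5, 5, 5, 5, 5, 5, 18, 18 |
| Cafeteria | Holidays | Air conditioner | -, -, -, -, -, -, -, -, -, -, - |
| Cafeteria | Holidays | Heating | 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5 |
| Serviced apartment | Year round | Air conditioner | 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26 |
| Serviced apartment | Year round | Heating | 22, 22, 22, 22, 22, 22, 22, 22, 22, 22, 22 |
| Transportation | Year round | Air conditioner | -, -, -, -, -, 28, 26, 26, 26, 26, 26 |
| Transportation | Year round | Heating | 5, 5, 5, 5, 12, 18, 20, 20, 20, 20, 20 |

Time: hours 13 to 24

| Area | Period | Mode | Values in hourly order |
|---|---|---|---|
| Office | Weekdays | Air conditioner | 26, 26, 26, 26, 26, -, -, -, -, -, - |
| Office | Weekdays | Heating | 20, 20, 20, 20, 20, 18, 12, 5, 5, 5, 5 |
| Office | Holidays | Air conditioner | -, -, -, -, -, -, -, -, -, -, - |
| Office | Holidays | Heating | 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5 |
| Cafeteria | Weekdays | Air conditioner | 26, 26, -, -, -, -, -, -, -, -, - |
| Cafeteria | Weekdays | Heating | 18, 18, 5, 5, 5, 5, 5, 5, 5, 5, 5 |
| Cafeteria | Holidays | Air conditioner | -, -, -, -, -, -, -, -, -, -, - |
| Cafeteria | Holidays | Heating | 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5 |
| Serviced apartment | Year round | Air conditioner | 26, 26, 26, 26, 26, 26, 26, 26, 26 |
| Serviced apartment | Year round | Heating | 22, 22, 22, 22, 22, 22, 22, 22, 22 |
| Transportation | Year round | Air conditioner | 26, 26, 26, 26, 26, -, -, -, -, -, - |
| Transportation | Year round | Heating | 20, 20, 20, 20, 20, 18, 12, 5, 5, 5, 5 |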

Appendix 6. The worst-case scenarios and the optimal spatial layout during the iterative process in Shenzhen, Kunming, Shanghai, and Harbin


[Floor plans for FLOOR1 to FLOOR8 are shown for each solution. Legend: Vertical Transport, Apartment, Office, Meeting, Canteen.]

| City | Solution | Generation | Individual | Total loads (kW·h/m²) | Cooling load (kW·h/m²) | Heating load (kW·h/m²) |
|---|---|---|---|---|---|---|
| Shenzhen | Max cooling and heating load | 1 | 4 | 69.0 | 68.7 | 0.3 |
| Shenzhen | Min cooling and heating load | 12 | 17 | 57.1 | 56.9 | 0.2 |
| Kunming | Max cooling and heating load | 1 | 9 | 5.3 | 2.8 | 2.6 |
| Kunming | Min cooling and heating load | 15 | 24 | 4.0 | 2.7 | 1.3 |
| Shanghai | Max cooling and heating load | 1 | 15 | 42.6 | 30.6 | 12.1 |
| Shanghai | Min cooling and heating load | 22 | 23 | 36.8 | 27.9 | 8.9 |
| Harbin | Max cooling and heating load | 1 | 52 | 93.6 | 9.9 | 83.7 |
| Harbin | Min cooling and heating load | 19 | 25 | 83.4 | 10.8 | 72.6 |
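
The gap between the maximum-load and minimum-load solutions reported above can be turned into a relative reduction with a short calculation. The sketch below is an illustrative check rather than the authors' post-processing. The total loads are copied from Appendix 6, while the `TOTAL_LOADS` dictionary and the `percent_reduction` helper are assumed names.

```python
from typing import Dict, Tuple

# (max-load solution, min-load solution) total loads in kW·h/m², from Appendix 6
TOTAL_LOADS: Dict[str, Tuple[float, float]] = {
    "Shenzhen": (69.0, 57.1),
    "Kunming": (5.3, 4.0),
    "Shanghai": (42.6, 36.8),
    "Harbin": (93.6, 83.4),
}


def percent_reduction(worst: float, best: float) -> float:
    """Relative reduction of the best solution with respect to the worst, in percent."""
    return (worst - best) / worst * 100.0


for city, (worst, best) in TOTAL_LOADS.items():
    saved = worst - best
    print(f"{city}: {saved:.1f} kW·h/m² saved, {percent_reduction(worst, best):.1f} % reduction")
```

Because the appendix reports loads to one decimal place, the results only approximate the percentages quoted in the abstract; for Shenzhen the calculation gives about 17.2 %, against the reported 17.25 %.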

Data availability

Data will be made available on request.

References

[1] United Nations Environment P, Global Alliance for B, Construction. Not Just Another Brick in the Wall: the Solutions Exist - Scaling them will Build on Progress and Cut Emissions Fast. Global Status Report for Buildings and Construction 2024/2025, United Nations Environment Programme, 2025.
[2] IPCC, in: H. Lee, J. Romero (Eds.), Climate Change 2023: Synthesis Report, 2023.
[3] T.L. Hemsath, Conceptual energy modeling for architecture, planning and design: impact of using building performance simulation in early design stages, in: 3th Conference of International Building Performance Simulation Association, August 26-28. 2013. Chamb´ery, France.
[4] F. Kheiri, A review on optimization methods applied in energy-efficient building geometry and envelope design, Renew. Sustain. Energy Rev. 92 (2018) 897–920.
[5] S. Li, M. Wang, P. Shen, X. Cui, L. Bu, R. Wei, et al., Energy saving and thermal comfort performance of passive retrofitting measures for traditional rammed Earth house in Lingnan, China, Buildings 12 (2022) 1716.
[6] P. Shen, Y. Li, X. Gao, S. Chen, X. Cui, Y. Zhang, et al., Climate adaptability of building passive strategies to changing future urban climate: a review, Nexus 2 (2025) 1–13.
[7] P. Shen, Z. Wang, Y. Ji, Exploring potential for residential energy saving in New York using developed lightweight prototypical building models based on survey data in the past decades, Sustain. Cities Soc. 66 (2021) 102659.
[8] S. Himmetoglu, ˘ Y. Delice, E. Kızılkaya Aydogan, ˘ B. Uzal, Green building envelope designs in different climate and seismic zones: multi-objective ANN-based genetic algorithm, Sustain. Energy Technol. Assessments 53 (2022) 102505.
[9] S. Himmetoglu, ˘ Y. Delice, E.K. Aydogan, ˘ PSACONN mining algorithm for multi-factor thermal energy-efficient public building design, J. Build. Eng. 34 (2021) 102020.

[10] N. Es-sakali, J. Pfafferott, M.O. Mghazli, M. Cherkaoui, Towards climate-responsive net zero energy rural schools: a multi-objective passive design optimization with bio-based insulations, shading, and roof vegetation, Sustain. Cities Soc. 120 (2025) 106142.
[11] D. Sun, Y. Zheng, R. Duan, Energy consumption simulation and economic benefit analysis for urban electric commercial-vehicles, Transport. Res. Transport Environ. 101 (2021) 103083.
[12] W. Huang, H. Zheng, Architectural drawings recognition and generation through machine learning, in: Proceedings of the 38th Annual Conference of the Association for Computer Aided Design in Architecture, 2018, pp. 18–20. Mexico City, Mexico.
[13] Z. Zhang, A. Chong, Y. Pan, C. Zhang, K.P. Lam, Whole building energy model for HVAC optimal control: a practical framework based on deep reinforcement learning, Energy Build. 199 (2019) 472–490.
[14] Y. Chen, T. Hong, M.A. Piette, Automatic generation and simulation of urban building energy models based on city datasets for city-scale building retrofit analysis, Appl. Energy 205 (2017) 323–335.
[15] T.L. Hemsath, K. Alagheband Bandhosseini, Sensitivity analysis evaluating basic building geometry’s effect on energy use, Renew. Energy 76 (2015) 526–538.
[16] R. Pacheco, J. Ordo´nez, ˜ G. Martínez, Energy efficient design of building: a review, Renew. Sustain. Energy Rev. 16 (2012) 3559–3573.
[17] H. Latha, S. Patil, P.G. Kini, JIJoE, Eng. E. Influence of Architectural Space Layout and Building Perimeter on the Energy Performance of Buildings: a Systematic Literature Review, vol. 14, 2023, pp. 431–474.
[18] T. Du, S. Jansen, M. Turrin, Dobbelsteen Avd, Effect of space layouts on the energy performance of office buildings in three climates, J. Build. Eng. 39 (2021).
[19] T. Du, M. Turrin, S. Jansen, A. van den Dobbelsteen, F. De Luca, Relationship analysis and optimisation of space layout to improve the energy performance of office buildings, Energies 15 (2022).
[20] I.G. Dino, G. Üçoluk, Multiobjective design optimization of building space layout, energy, and daylighting performance, J. Comput. Civ. Eng. 31 (2017).
[21] P. Shen, Y. Li, X. Gao, Y. Zheng, P. Huang, A. Lu, et al., Recent progress in building energy retrofit analysis under changing future climate: a review, Appl. Energy 383 (2025) 125441.
[22] Y. Li, L. Li, X. Cui, P. Shen, Coupled building simulation and CFD for real-time window and HVAC control in sports space, J. Build. Eng. 97 (2024) 110731.
[23] Y. Li, L. Li, P. Shen, Probability-based visual comfort assessment and optimization in national fitness halls under sports behavior uncertainty, Build. Environ. (2023) 110596.
[24] Z. Ma, G. Jiang, Y. Hu, J. Chen, A review of physics-informed machine learning for building energy modeling, Appl. Energy 381 (2025) 125169.
[25] P.W. Tien, S. Wei, J. Darkwa, C. Wood, J.K. Calautit, Machine learning and deep learning methods for enhancing building energy efficiency and indoor environmental quality – a review, Energy AI 10 (2022) 100198.
[26] T. Du, S. Jansen, M. Turrin, A. van den Dobbelsteen, Effects of architectural space layouts on energy performance: a review, Sustainability 12 (2020).
[27] T. Dogan, C. Reinhart, P. Michalatos, Autozoner: an algorithm for automatic thermal zoning of buildings with unknown interior space definitions, J. Build. Perform. Simulat. 9 (2016) 176–189.
[28] H. Shen, A. Tzempelikos, Sensitivity analysis on daylighting and energy performance of perimeter offices with automated shading, Build. Environ. 59 (2013) 303–314.
[29] P. Delgoshaei, M. Heidarinejad, K. Xu, J.R. Wentz, P. Delgoshaei, J. Srebric, Impacts of building operational schedules and occupants on the lighting energy consumption patterns of an office space, Build. Simulat. 10 (2017) 447–458.
[30] P. Shen, W. Braham, Y. Yi, Development of a lightweight building simulation tool using simplified zone thermal coupling for fast parametric study, Appl. Energy 223 (2018) 188–214.
[31] K. Verichev, M. Zamorano, A. Fuentes-Sepúlveda, N. C´ardenas, M. Carpio, Adaptation and mitigation to climate change of envelope wall thermal insulation of residential buildings in a temperate oceanic climate, Energy Build. 235 (2021).
[32] C. Baglivo, P.M. Congedo, G. Murrone, D. Lezzi, Long-term predictive energy analysis of a high-performance building in a mediterranean climate under climate change, Energy 238 (2022).
[33] Y. Shi, Z. Yan, C. Li, C. Li, Energy consumption and building layouts of public hospital buildings: a survey of 30 buildings in the cold region of China, Sustain. Cities Soc. 74 (2021).
[34] J. Dai, S. Jiang, J. Li, X. Xu, M. Wu, The influence of layout on energy performance of university building, IOP Conf. Ser. Earth Environ. Sci. 371 (2019).
[35] T. Cheng, N. Wang, C.H. Liu, Research on energy consumption of building layout and envelope for rural housing in the cold region of China, IOP Conf. Ser. Earth Environ. Sci. 238 (2019).
[36] I. Susorova, M. Tabibzadeh, A. Rahman, H.L. Clack, M. Elnimeiri, The effect of geometry factors on fenestration energy performance and energy savings in office buildings, Energy Build. 57 (2013) 6–13.
[37] Y. Pan, M. Zhu, Y. Lv, Y. Yang, Y. Liang, R. Yin, et al., Building Energy Simulation and its Application for Building Performance Optimization: a Review of Methods, Tools, and Case Studies, vol. 10, 2023 100135.
[38] A.-T. Nguyen, S. Reiter, Rigo Pjae, A Review on Simulation-based Optimization Methods Applied to Building Performance Analysis, vol.113, 2014, pp. 1043–1058.
[39] P. Shen, Building retrofit optimization considering future climate and decision-making under various mindsets, J. Build. Eng. 96 (2024) 110422.
[40] N. Delgarm, B. Sajadi, F. Kowsary, SJAe Delgarm, Multi-objective optimization of the building energy performance: a simulation-based approach by means of particle swarm optimization, PSO) 170 (2016) 293–303.
[41] A. Vukadinovi´c, J. Radosavljevi´c, A. Đorđevi´c, M. Proti´c, N.J.S.E. Petrovi´c, Multi-Objective Optimization of Energy Performance for a Detached Residential Building with a Sunspace Using the NSGA-II Genetic Algorithm, vol.224, 2021, pp. 1426–1444.
[42] M. Ghaderian, F. Veysi, Multi-objective optimization of energy efficiency and thermal comfort in an existing office building using NSGA-II with fitness approximation, A case study 41 (2021) 102440.
[43] P. Shen, W. Braham, Y. Yi, E. Eaton, Rapid multi-objective optimization with multi-year future weather condition and decision-making support for building retrofit, Energy 172 (2019) 892–912.
[44] L. Breiman, Random Forests, 2001.
[45] M. Zeki´c-Suˇsac, A. Has, M. Kneˇzevi´c, Predicting energy cost of public buildings by artificial neural networks, CART, and random forest, Neurocomputing 439 (2021) 223–233.
[46] G.K.F. Tso, K.K.W. Yau, Predicting electricity energy consumption: a comparison of regression analysis, decision tree and neural networks, Energy 32 (2007) 1761–1768.
[47] P.P. Angelov, E.A. Soares, R. Jiang, N.I. Arnold, P.M. Atkinson, Explainable artificial intelligence: an analytical review, WIREs Data Min. Knowl. Discov. 11 (2021).
[48] R. Dwivedi, D. Dave, H. Naik, S. Singhal, R. Omer, P. Patel, et al., Explainable AI (XAI): core ideas, techniques, and solutions, ACM Comput. Surv. 55 (2023) 1–33.
[49] S.M. Lundberg, S.-I. Lee, A Unified Approach to Interpreting Model Predictions, 2017.
[50] Y. Wu, Y. Zhou, Hybrid machine learning model and shapley additive explanations for compressive strength of sustainable concrete, Constr. Build. Mater. 330 (2022).
[51] P. Meddage, I. Ekanayake, U.S. Perera, H.M. Azamathulla, M.A. Md Said, U. Rathnayake, Interpretation of machine-learning-based (Black-box) wind pressure predictions for low-rise gable-roofed buildings using shapley additive explanations (SHAP), Buildings 12 (2022).
[52] P. Arjunan, K. Poolla, C. Miller, EnergyStar++: towards more accurate and explanatory building energy benchmarking, Appl. Energy 276 (2020).
[53] M. Vega García, J.L. Aznarte, Shapley additive explanations for NO2 forecasting, Ecol. Inform. 56 (2020).
[54] P. Shen, H. Wang, Archetype building energy modeling approaches and applications: a review, Renew. Sustain. Energy Rev. 199 (2024) 114478.
[55] M. Mona, A. Reem, Effect of selecting validation dataset on building random forest and decision tree models, AlQalam J. Med. Appl. Sci. 5 (2022) 470–478.
[56] H. Bichri, A. Chergui, M. Hain, Investigating the impact of train/test split ratio on the performance of pre-trained models with custom datasets, Int. J. Adv. Comput. Sci. Appl. 15 (2024).

[57] J. Bergstra, Y. Bengio, Random search for hyper-parameter optimization, J. Mach. Learn. Res. 13 (2012) 281–305.
[58] P. Probst, M.N. Wright, A.L. Boulesteix, Hyperparameters and tuning strategies for random forest, Wiley Interdiscip. Rev.: Data Min. Knowl. Discov. 9 (2019) e1301.
[59] E. Gonzalez-Estrada, ´ W. Cosmes, Shapiro–Wilk test for skew normal distributions based on data transformations, J. Stat. Comput. Simulat. 89 (2019) 3258–3272.
[60] M.K.M. Shapi, N.A. Ramli, L.J. Awalin, Energy consumption prediction by using machine learning for smart building: case study in Malaysia, Dev. Built Environ. 5 (2021) 100037.
[61] Y. Arima, R. Ooka, H. Kikumoto, Proposal of typical and design weather year for building energy simulation, Energy Build. 139 (2017) 517–524.
[62] N. Amin, F. J´erome, ˆ vT. Christoph, Statistical methodologies for verification of building energy performance simulation, in: Proceedings of Building Simulation 2021: 17Th Conference of IBPSA, IBPSA, 2021, pp. 1719–1726.
[63] R.F. Mustapa, N.Y. Dahlan, I.M. Yassin, A.H.M. Nordin, M.E. Mahadan, Baseline energy modelling in an educational building campus for measurement and verification, in: 2017 International Conference on Electrical, Electronics and System Engineering (ICEESE), 2017, pp. 67–72.
[64] C. Fan, F. Xiao, S. Wang, Development of prediction models for next-day building energy consumption and peak power demand using data mining techniques, Appl. Energy 127 (2014) 1–10.
[65] N.-T. Ngo, A.-D. Pham, T.T.H. Truong, N.-S. Truong, N.-T. Huynh, Developing a hybrid time-series artificial intelligence model to forecast energy use in buildings, Sci. Rep. 12 (2022) 15775.
[66] E. Yaghoubi, E. Yaghoubi, A. Khamees, A.H. Vakili, A systematic review and meta-analysis of artificial neural network, machine learning, deep learning, and ensemble learning approaches in field of geotechnical engineering, Neural Comput. Appl. 36 (2024) 12655–12699.
[67] J. Long, K. Xueyuan, H. Huang, Q. Zhinian, Y. Wang, Study on the overfitting of the artificial neural network forecasting model, Acta Meteorol. Sin. 19 (2005) 216.
[68] J. Bergstra, R. Bardenet, Y. Bengio, B. K´egl, Algorithms for hyper-parameter optimization, Adv. Neural Inf. Process. Syst. 24 (2011).
[69] M.T. Ribeiro, S. Singh, C. Guestrin, Why should I trust you?, in: Explaining the Predictions of Any Classifier. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining Association for Computing Machinery, San Francisco, California, USA, 2016, pp. 1135–1144.
