1 Introduction

Concrete is known for its high compressive strength, durability, and adaptability [1, 2]. These qualities make it the material of choice for a wide range of infrastructure and building applications [3,4,5,6]. The ability of concrete to be molded into a variety of forms has made it indispensable in construction projects across the globe [7, 8]. Not only can concrete withstand extreme loading conditions, but it can also be engineered to meet specific performance criteria [9, 10]. The availability and affordability of its raw materials ensure that concrete remains the most commonly used construction material worldwide [11]. Technological advancements have further expanded the applications of this material and improved its functional properties [12,13,14,15]. Every year, 30 billion tons of concrete are produced. This immense volume is driven by several factors, including the increasing demand for infrastructure development, rapid urbanization, and population growth [16]. Concrete remains a central material in global construction [17,18,19]. India is one of the largest producers and consumers of concrete. The rapid pace of urbanization and the growth of infrastructure projects across the nation have significantly contributed to this demand [20,21,22,23]. This surge in concrete production has raised concerns regarding the sustainability of its raw material supply, particularly with respect to fine aggregates [5, 6, 24, 25]. The extraction of sand from riverbeds and quarries has led to environmental degradation. This has created a need to explore alternative materials that can replace fine aggregates without compromising the performance of concrete [26, 27]. Using dry sewage sludge in place of part of the fine aggregate in concrete is one viable alternative.
After raw sewage is treated through primary and secondary treatment stages, the remaining sludge is dewatered by mechanical methods such as centrifugation or belt presses [28,29,30]. The resulting semisolid material is subjected to thermal drying to reduce the moisture content [31, 32]. The application of sewage sludge as an aggregate replacement has the potential to offer a sustainable solution to urban waste management issues [33,34,35].

According to the Bangalore Water Supply and Sewerage Board (BWSSB), wastewater treatment in the city generates approximately 200–250 tons of dry sewage sludge per day, depending on the treatment processes and drying efficiency. By repurposing this waste as a resource in concrete production, the city could not only address its waste management challenges but also contribute to more sustainable building practices [32, 33]. Fine aggregates, which usually consist of natural sand, are essential for establishing the mechanical properties and workability of concrete [36, 37]. The partial substitution of these fine aggregates with sewage sludge at 3%, 6%, 9%, and 12% by weight offers an opportunity to assess the material’s impact on concrete performance. The primary objective of this study is to determine the optimal replacement percentage at which concrete exhibits its maximum strength and workability [38,39,40]. In this work, machine learning approaches are integrated to improve the analysis and forecasting of the performance of various concrete mixtures.

Nonlinear interactions among input parameters often make it difficult for traditional statistical models to predict concrete properties accurately. To address these limitations, the present study leverages advanced machine learning (ML) techniques, specifically multilayer perceptron (MLP), random forest (RF), and decision tree (DT) models, which are well suited for capturing intricate and nonlinear relationships between concrete mix components and mechanical properties [42,43,44,45]. These models are employed to predict critical performance indicators of digested sewage sludge (DSS)-based concrete, including compressive strength, split tensile strength, flexural strength, and slump value. The MLP model, a type of deep learning architecture, consists of interconnected layers of artificial neurons that enable it to learn and model complex functional mappings between input features and target outputs. Its capacity to handle nonlinearities and high-dimensional data makes it particularly effective in predicting concrete behavior [46]. The RF model, on the other hand, employs an ensemble learning approach that aggregates predictions from multiple independent decision trees to produce a more generalized and robust outcome. This reduces the risk of overfitting and improves predictive performance, especially when working with heterogeneous datasets [43].

The DT model, while simpler, provides interpretable rule-based structures that allow for a clear understanding of the decision paths and threshold-based influence of input variables on the output responses. All three models demonstrate high predictive accuracy, with the RF model achieving an R² value of up to 0.96 for compressive strength, indicating excellent agreement between predicted and observed values. Beyond prediction, these ML models are also valuable tools for optimizing mix design by identifying the optimal ranges of ingredients needed to achieve desired performance outcomes [47,48,49,50]. To enhance model transparency and interpretability, the study integrates Shapley Additive Explanations (SHAP), an explainable AI framework that quantifies the contribution of each input variable to the final prediction. SHAP values enable the decomposition of output predictions into additive feature contributions, thus providing meaningful insights into the relative importance and influence of individual mix components on the mechanical performance of DSS-based concrete.
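
The additive decomposition behind SHAP can be illustrated with an exact Shapley computation on a toy model. The sketch below is not the SHAP library and not one of the models used in this study: the function `toy_strength`, its coefficients, and the baseline mix are hypothetical, chosen only to show that the per-feature contributions sum to the difference between the prediction and the baseline prediction.

```python
from itertools import permutations


def toy_strength(features):
    """Hypothetical strength model (MPa); coefficients are illustrative only."""
    cement, water, dss = features
    return 0.1 * cement - 0.05 * water - 0.2 * dss - 0.001 * water * dss


def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution over all feature orders."""
    n = len(x)
    totals = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)      # start from the baseline mix
        previous = model(current)
        for i in order:               # switch features to their actual values one by one
            current[i] = x[i]
            value = model(current)
            totals[i] += value - previous
            previous = value
    return [t / len(orders) for t in totals]


x = [350.0, 175.0, 22.0]          # cement, water, DSS in kg/m^3 (hypothetical mix)
baseline = [370.0, 160.0, 47.0]   # hypothetical reference mix
phi = shapley_values(toy_strength, x, baseline)
print("contributions:", phi)
print("sum of contributions:", sum(phi))
print("prediction minus baseline:", toy_strength(x) - toy_strength(baseline))
```

The enumeration over all feature orders is only practical for a handful of features; practical SHAP implementations rely on approximations or model-specific algorithms.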

By combining high-accuracy predictive modeling with explainable machine learning techniques, this research not only builds upon prior foundational work in concrete technology but also advances the field toward data-driven material optimization. The dual emphasis on performance assessment and interpretability establishes a framework for developing sustainable and intelligent concrete design strategies [51,52,53,54,55,56,57].

The DSS used in this study was obtained after anaerobic digestion and solar drying, which reduce pathogens and odour. By using a locally available waste material, this work conserves natural sand resources and thereby supports progress towards SDG 12 (Responsible Consumption and Production) and SDG 11 (Sustainable Cities and Communities). A summary of recent studies applying machine learning to concrete mix design is provided in Table 1.

Table 1 Summary of recent studies on machine learning applications in concrete

2 Materials, mixture design and methodology

2.1 Materials

2.1.1 Cement

The ordinary Portland cement (OPC) employed in this study is Grade 43, which complies with the IS 8112:1989 or IS 12269:1987 requirements. The characteristics of the cement are listed in Table 2.

Table 2 Properties of cement

2.1.2 Coarse aggregate

Crushed granite stones no larger than 20 mm are used for this project. These aggregates are sourced locally. The different characteristics of the coarse aggregates are highlighted in Table 3. The coarse aggregate particle size distribution curve is shown in Fig. 1.

Fig. 1 Particle size distribution curve of the coarse aggregate

Table 3 Properties of the coarse and fine aggregates

2.1.3 Fine aggregate

The fine aggregate used for this research includes manufactured sand (M-Sand) and dry sewage sludge. Figure 2(a, b, and c) shows the powder sludge, SEM image of the sludge and chemical composition of the sludge, respectively. The SEM image highlights a complex and heterogeneous morphology composed of angular, irregularly shaped particles. The larger particles exhibit rough, porous surfaces and fractured edges. These textural features suggest a high surface area, which can enhance reactivity if used in cementitious systems. The fine particles adhering to the surfaces of larger aggregates may consist of ash residues or micromineral phases such as silicates, aluminates, and oxides. The loose agglomeration and lack of uniformity observed in the image suggest poor particle packing, which may influence the workability.

The sludge used in this study was processed to reduce its moisture content to less than 10%. Anaerobic digestion, mechanical dewatering, and natural solar drying were performed at the municipal treatment facility. In the laboratory, to obtain a texture suitable for use as a fine aggregate, the material was oven-dried at 150 to 200 °C, pounded, and then passed through a 2.36 mm sieve. Because low-energy, sustainable processing was the main goal of this work, calcination was not used.

Table 4 lists the characteristics of the dry sewage sludge. The dominant peaks for oxygen (O), silicon (Si), aluminium (Al), and calcium (Ca) indicate the presence of silicate and aluminate compounds, as well as calcium-based minerals such as calcite and gypsum. These constituents are typically found in natural sands and contribute positively to mechanical performance when dry sewage sludge is used as a partial replacement for fine aggregates in concrete. Iron (Fe) is also present in significant amounts, possibly as iron oxides or hydroxides, which may influence the colouration and durability of the composite. Minor elements such as phosphorus (P), sulphur (S), potassium (K), sodium (Na), and magnesium (Mg) are indicative of biological and industrial origins, and although present in relatively small quantities, they may affect the chemical stability and long-term performance of the concrete matrix. The particle size distribution of M-Sand complied with IS 383:2016, ensuring proper gradation for optimal workability. Table 3 highlights the various properties of the fine aggregates. Figure 3 shows the particle size distribution curves of the fine aggregate and dry sewage sludge.

Table 4 Properties of dry sewage sludge
Fig. 2 Characterization of sludge

Fig. 3 Particle size distribution curves of fine aggregate and dry sewage sludge

EDS analysis of dry sewage sludge (DSS) confirmed the presence of Si, Al, Ca, and Fe minerals, which contribute to the strength and durability of the concrete. The balanced presence of silicates, aluminates, and calcite supports the use of DSS in the role of natural fine aggregates in concrete. The fine texture and high water absorption capacity of the sludge affect workability.

2.2 Mix design

Table 5 outlines the proportions of the constituents used in the five concrete mixes: CM (control mixture), DSS3, DSS6, DSS9, and DSS12. For all the mixtures, the quantities of cement, water, coarse aggregate, and admixture, as well as the water-cement ratio, were held constant. Each mixture contains 350 kg/m³ of cement, 175 kg/m³ of water, 1150.88 kg/m³ of coarse aggregate, and 2 kg/m³ of admixture, with a fixed W/C ratio of 0.50. The control mixture (CM) contains no sludge, and its fine aggregate content is 742.05 kg/m³. In mixes DSS3, DSS6, DSS9, and DSS12, 3%, 6%, 9%, and 12% of the fine aggregate, respectively, is replaced with sewage sludge, corresponding to sludge quantities of 22.26 kg/m³, 44.52 kg/m³, 66.78 kg/m³, and 89.05 kg/m³. The mix design was performed in accordance with IS 10262:2019, giving a cement : fine aggregate : coarse aggregate ratio of 1:2.12:3.28.
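
The sludge quantities quoted above follow directly from the replacement percentages. The short sketch below reproduces that arithmetic from the stated control quantities; the function name `dss_mix` is illustrative, and the remaining fine aggregate is computed on the assumption that the sludge replaces sand weight for weight (Table 5 remains the authoritative source).

```python
# Control-mix quantities per cubic metre, as stated in the text.
CEMENT = 350.0               # kg/m^3
FINE_AGG_CONTROL = 742.05    # kg/m^3
COARSE_AGG = 1150.88         # kg/m^3


def dss_mix(replacement_pct):
    """Return (sludge, remaining fine aggregate) in kg/m^3 for a replacement level by weight."""
    sludge = FINE_AGG_CONTROL * replacement_pct / 100.0
    # Assumption: sludge replaces fine aggregate weight for weight.
    return sludge, FINE_AGG_CONTROL - sludge


for pct in (0, 3, 6, 9, 12):
    sludge, sand = dss_mix(pct)
    print(f"{pct:>2}% replacement: sludge = {sludge:6.2f} kg/m^3, fine aggregate = {sand:6.2f} kg/m^3")

# Mix proportions relative to cement.
print(f"fine aggregate / cement   = {FINE_AGG_CONTROL / CEMENT:.3f}")
print(f"coarse aggregate / cement = {COARSE_AGG / CEMENT:.3f}")
```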

Table 5 Mix design of various mixtures

2.3 Methodology

The workflow depicted in Fig. 4 outlines a comprehensive approach that combines experimental investigations and machine learning techniques to evaluate the mechanical performance of concrete. To guarantee appropriate workability, slump flow is evaluated during the mixing process. The compressive strength, split tensile strength, and flexural strength of hardened concrete samples are the mechanical characteristics that were examined.

The data obtained from the experimental work are then utilized in the machine learning phase. This phase starts with data preparation and processing to ensure that the dataset is suitable for modelling. Linear regression, LASSO regression, decision tree regression, random forest regression, and multilayer perceptron (MLP) regression are employed to predict the mechanical properties. The final step involves a thorough analysis of the model results via tools such as Taylor diagrams, regression error characteristic (REC) curves, and Shapley additive explanations (SHAP) values.

Fig. 4 Methodology

2.3.1 Workability test

The slump test, which reflects the consistency and fluidity of the mixture, was used to determine the workability.

2.3.1.1 Slump test

A slump test was performed in compliance with IS 1199:1959 to assess the workability of the fresh concrete. After the cone was filled and compacted, it was carefully raised vertically. The slump was then measured as the difference between the height of the cone and that of the subsided concrete.

2.3.1.2 Mechanical property tests

The mechanical properties of the hardened concrete were evaluated via three essential tests: compressive strength, flexural strength, and split tensile strength.

2.3.1.3 Compressive strength test

The test was conducted according to IS 516:1959. For this test, 150 × 150 × 150 mm cube samples were cast. The load was applied gradually at a rate of 140 kg/cm²/min. The maximum load at failure was recorded and used to compute the compressive strength.

2.3.1.4 Flexural strength test

A concrete sample’s resistance to bending or flexural stresses is measured by its flexural strength. The test was conducted in compliance with IS 516:1959. For this test, 100 × 100 × 500 mm beam samples were cast. Following curing, the beams were positioned on two supports, and testing equipment was used to apply a two-point loading system. The failure load was used to compute the flexural strength after the load was applied at the midspan of the beam.

2.3.1.5 Split tensile strength test

The split tensile strength test was conducted in accordance with IS 5816:1999 on cylindrical samples that were 150 mm in diameter and 300 mm in height.
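
For readers who want to reproduce the strength calculations, the sketch below converts a failure load into a strength value for the cube and cylinder geometries described above. The formulas (load over loaded area for cubes, 2P/(πDL) for split cylinders) are the standard expressions and are assumed here rather than quoted from the text; the failure loads are hypothetical.

```python
import math


def cube_compressive_strength(failure_load_n, side_mm=150.0):
    """Compressive strength in MPa (N/mm^2): failure load divided by the loaded face area."""
    return failure_load_n / (side_mm * side_mm)


def split_tensile_strength(failure_load_n, diameter_mm=150.0, length_mm=300.0):
    """Split tensile strength in MPa for a cylinder: 2P / (pi * D * L)."""
    return 2.0 * failure_load_n / (math.pi * diameter_mm * length_mm)


# Hypothetical failure loads in newtons, for illustration only.
cube_load = 877_500.0
cylinder_load = 450_000.0

print(f"cube compressive strength: {cube_compressive_strength(cube_load):.1f} MPa")
print(f"split tensile strength:    {split_tensile_strength(cylinder_load):.2f} MPa")
```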

3 Machine learning

3.1 Dataset preparation

The dataset utilized for the machine learning models was compiled from a number of publications and combined into a single table with variables such as cement, water, aggregates, dry sewage sludge, age, and compressive strength. A final set of 185 samples was created.

3.2 Data splitting

The dataset is first divided into training and testing sets, with the former usually being larger than the latter. Once trained on the training set, the model is applied to the unseen testing set for prediction. This provides an honest assessment of a model’s performance and shows how well it generalizes to unseen data. After experiments with a variety of train-test split ratios, a 70% training share proved the most successful; as a result, it was used for all of the models in the study.
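
A random 70/30 split of the kind described above can be sketched with the standard library alone. The helper name `train_test_split`, the seed, and the placeholder sample list are assumptions for illustration; the study's own tooling is not specified in the text.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows and cut it into training and testing parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)       # fixed seed for a reproducible split
    cut = int(len(shuffled) * train_fraction)   # truncate to a whole number of rows
    return shuffled[:cut], shuffled[cut:]


samples = list(range(185))   # placeholder indices standing in for the 185 records
train, test = train_test_split(samples)
print(len(train), "training rows,", len(test), "testing rows")
```

Fixing the seed keeps the split identical across the models being compared, so that differences in the metrics reflect the models rather than the partition.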

3.3 Performance indices

The compressive strength was predicted via a variety of machine learning techniques. The predictive ability of the models is evaluated and compared via four widely used performance metrics: the root mean square error (RMSE), mean absolute error (MAE), mean squared error (MSE), and coefficient of determination (R²). Each metric offers a different perspective on a model’s accuracy, and together they provide a thorough evaluation.

3.3.1 Root mean square error

The RMSE is the square root of the average squared difference between the predicted and actual values. It measures the standard deviation of the prediction errors, also referred to as residuals. Because it penalizes larger errors more heavily, this measure is sensitive to outliers and is particularly useful when large errors are undesirable. Lower RMSE values indicate better model performance.

$$RMSE=\sqrt{\frac{1}{n}\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}$$

3.3.2 Mean absolute error

The MAE is the average absolute difference between the predicted and actual values. Because it is expressed in the same units as the target variable, it provides a straightforward measure of the average prediction error. The MAE is less affected by outliers than the RMSE is.

$$MAE=\frac{1}{n}\sum_{i=1}^{n}\left|{y}_{i}-{\widehat{y}}_{i}\right|$$

3.3.3 Mean squared error

The MSE is the average of the squared differences between the predicted and actual values. Like the RMSE, the MSE penalizes larger errors more heavily. Although it is helpful for comparing models, its squared units make it harder to interpret.

$$MSE=\frac{1}{n}\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}$$

3.3.4 R2 score

R² measures the proportion of the variance in the dependent variable that can be predicted from the independent variable or variables. A value of 1 represents a perfect prediction, whereas a value of 0 indicates that the model is no better than simply predicting the mean of the dependent variable. R² is therefore a useful metric for assessing how well the model represents the variability of the data.

These metrics present a number of viewpoints on the model’s performance. Both the MSE and the RMSE are effective for comparing models and are sensitive to large errors. The MAE can be used to interpret the average error magnitude readily. R2 indicates the explanatory power of the model relative to a baseline mean model. All of these indicators can be combined to provide a thorough evaluation of each model’s performance in predicting the compressive strength of concrete.

$$R^{2}=1-\frac{\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}}{\sum_{i=1}^{n}{\left({y}_{i}-\bar{y}\right)}^{2}}$$
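
The four performance indices can be computed directly from their definitions. The following minimal sketch does so in plain Python; the function name `regression_metrics` and the observed and predicted strengths are hypothetical and serve only to show the calculation.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MSE and R2 for paired observed and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_y = sum(y_true) / n
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares about the mean
    return {"RMSE": math.sqrt(mse), "MAE": mae, "MSE": mse, "R2": 1.0 - ss_res / ss_tot}


# Hypothetical compressive strengths in MPa.
observed = [30.0, 35.0, 40.0, 25.0]
predicted = [28.0, 36.0, 39.0, 27.0]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```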

3.4 Machine learning model

3.4.1 Linear regression

Linear regression is a simple model that predicts a continuous output as a weighted sum of the input features. During training on labelled data, the weights and bias are adjusted to minimize the mean squared error between the predicted and actual values. Gradient descent techniques are typically used for this optimization to identify the ideal parameters.

$$\widehat{y}={\theta}_{0}+{\theta}_{1}{x}_{1}+{\theta}_{2}{x}_{2}+\dots+{\theta}_{m}{x}_{m}$$
$$J\left(\theta\right)=\frac{1}{2n}\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}$$
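
The gradient descent update for this cost function can be sketched as follows. The implementation is a minimal batch version written for illustration; the function name `fit_linear_gd`, the learning rate, the number of epochs, and the toy data (generated from y = 1 + 2x) are assumptions, not settings from this study.

```python
def fit_linear_gd(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on J = (1/2n) * sum((y_hat - y)^2); theta[0] is the bias."""
    n, m = len(X), len(X[0])
    theta = [0.0] * (m + 1)
    for _ in range(epochs):
        preds = [theta[0] + sum(t * x for t, x in zip(theta[1:], row)) for row in X]
        residuals = [p - target for p, target in zip(preds, y)]
        # Partial derivatives of J with respect to the bias and each weight.
        grad_bias = sum(residuals) / n
        grads = [sum(r * row[j] for r, row in zip(residuals, X)) / n for j in range(m)]
        theta[0] -= lr * grad_bias
        for j in range(m):
            theta[j + 1] -= lr * grads[j]
    return theta


# Toy data on a scaled feature in [0, 1], generated from y = 1 + 2x.
X = [[0.0], [0.25], [0.5], [0.75], [1.0]]
y = [1.0, 1.5, 2.0, 2.5, 3.0]
theta = fit_linear_gd(X, y)
print("bias:", round(theta[0], 4), "weight:", round(theta[1], 4))
```

The feature is kept in the range 0 to 1 here because gradient descent converges slowly, or diverges for a fixed learning rate, when features have very different magnitudes.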

3.4.2 Lasso regression

LASSO regression extends linear regression by adding an L1 penalty term to the loss function. This penalty encourages sparsity in the coefficients and thereby facilitates feature selection. Although the training procedure is similar to that of linear regression, the non-differentiable L1 penalty is usually handled with specific optimization techniques such as coordinate descent.

$$J\left(\theta\right)=\frac{1}{2n}\sum_{i=1}^{n}{\left({y}_{i}-{\widehat{y}}_{i}\right)}^{2}+\alpha\sum_{j=1}^{m}\left|{\theta}_{j}\right|$$
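
To show how the L1 penalty produces sparse coefficients, the sketch below implements coordinate descent with the soft-thresholding operator for the cost function above, leaving the bias unpenalized. It is a didactic version under stated assumptions: the function names, the value of alpha, and the two-feature toy data (where the second feature has only a very small effect) are hypothetical.

```python
def soft_threshold(rho, alpha):
    """Shrink rho towards zero by alpha; values within [-alpha, alpha] become exactly 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def fit_lasso_cd(X, y, alpha, sweeps=100):
    """Coordinate descent for (1/2n) * sum of squared residuals + alpha * sum(|theta_j|)."""
    n, m = len(X), len(X[0])
    bias, theta = 0.0, [0.0] * m
    for _ in range(sweeps):
        # The unpenalized bias is the mean residual of the current fit.
        bias = sum(yi - sum(t * x for t, x in zip(theta, row)) for row, yi in zip(X, y)) / n
        for j in range(m):
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(
                row[j] * (yi - bias - sum(theta[k] * row[k] for k in range(m) if k != j))
                for row, yi in zip(X, y)
            ) / n
            z = sum(row[j] ** 2 for row in X) / n
            theta[j] = soft_threshold(rho, alpha) / z
    return bias, theta


# Toy data: y = 3*x1 + 0.1*x2, with the second feature nearly irrelevant.
X = [[-1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [1.0, -1.0]]
y = [-2.9, -3.1, 3.1, 2.9]
bias, theta = fit_lasso_cd(X, y, alpha=0.5)
print("bias:", bias, "coefficients:", theta)
```

With this penalty strength, the coefficient of the weak feature is set exactly to zero while the strong coefficient is shrunk towards zero, which is the feature-selection behaviour described above.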

3.4.3 Random forest regression

Random forest regression builds on the idea of decision trees by generating an ensemble of randomized trees. Each tree is trained on a bootstrap sample of the data, and a random subset of the features is considered at each split. The final prediction is the average of the individual trees’ predictions. This approach reduces overfitting and improves generalization.

3.4.4 Decision tree regression

Decision tree regressors use a different technique: they recursively partition the input space according to the features so as to lower the mean squared error within each resulting subset. Unlike the earlier models, decision trees do not use gradient descent. Rather, they use greedy algorithms to determine the optimal split points, which serve as the parameters of the model. Each candidate split is scored by the sample-weighted mean squared error of the left and right subsets, which contain m_L and m_R of the m samples at the node.

$$J\left(\theta\right)=\frac{{m}_{L}}{m}{MSE}_{L}+\frac{{m}_{R}}{m}{MSE}_{R}$$
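
The greedy split search used by regression trees can be sketched for a single feature as follows. The code scores every candidate threshold by the sample-weighted MSE of the two resulting subsets. As illustrative input it uses the five replacement levels and the approximate 28-day compressive strengths reported in Sect. 4.2.1; the function names are assumptions, and this is not the trained model from the study.

```python
def mse(values):
    """Mean squared deviation of the values from their own mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def best_split(x, y):
    """Return (threshold, cost) minimizing the sample-weighted MSE of the two subsets."""
    pairs = sorted(zip(x, y))
    m = len(pairs)
    best_threshold, best_cost = None, float("inf")
    for i in range(1, m):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical feature values
        left = [target for _, target in pairs[:i]]
        right = [target for _, target in pairs[i:]]
        cost = (len(left) * mse(left) + len(right) * mse(right)) / m
        if cost < best_cost:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_cost = cost
    return best_threshold, best_cost


dss_percent = [0, 3, 6, 9, 12]             # replacement levels of CM, DSS3, ..., DSS12
strength_28d = [39, 37, 32, 30, 23]        # approximate 28-day strengths in MPa
threshold, cost = best_split(dss_percent, strength_28d)
print(f"best split: DSS <= {threshold}% (weighted MSE = {cost:.3f})")
```

A full tree repeats this search over every feature and then recurses into each subset until a stopping rule, such as a maximum depth, is reached.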

3.4.5 Multi-Layer perceptron

Multilayer perceptrons (MLPs) are neural networks capable of modelling intricate, nonlinear interactions. They consist of several interconnected layers of nodes, each with its own set of weights and biases. MLPs are trained with backpropagation, a form of gradient descent that updates the parameters by propagating errors backwards through the network. The goal of the optimization procedure is to lower the mean squared error between the network’s outputs and the actual values.

3.5 Correlation map

The correlation heatmap (Fig. 5) illustrates the linear relationships between the concrete mix attributes, namely, cement (C), water (W), fine aggregate (FA), dry sewage sludge (DSS), coarse aggregate (CA), admixture (A), the water–cement ratio (WR) and compressive strength (CS). The Pearson correlation coefficient, which ranges from −1 to +1, is employed. Values near +1 indicate strong positive relationships, values near −1 indicate strong negative relationships, and values near 0 suggest little to no linear correlation. In this heatmap, CS is strongly positively correlated with admixture (A) at 0.80 and with coarse aggregate (CA) at 0.64, indicating that mixes with higher contents of these components tend to have higher compressive strength. Cement (C) also has a moderate positive correlation with CS at 0.43, as expected, due to its crucial role in strength development. Conversely, water (W) displays a moderate negative correlation with CS at −0.41, suggesting that a higher water content reduces strength, which aligns with established concrete behavior. Both DSS (0.008) and FA (0.023) are almost uncorrelated with CS. Notably, a potential labelling issue is evident, as water appears twice on the axes, with a strong correlation of 0.82 between the two W attributes, indicating duplication. Furthermore, DSS is strongly negatively correlated with FA (−0.68), and CA is strongly negatively correlated with water (−0.73). Overall, this heatmap helps identify which ingredients most influence compressive strength, highlighting the beneficial effects of admixtures and coarse aggregates and the detrimental impact of excess water, thereby serving as a valuable reference for optimizing concrete mix design.
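
Each cell of such a heatmap is a Pearson coefficient, which can be computed as in the sketch below. The input consists only of the five replacement levels and the approximate 28-day strengths reported in Sect. 4.2.1, so the result illustrates the calculation and is not comparable with the coefficients of the 185-sample dataset in Fig. 5. The function name `pearson` is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


dss_percent = [0, 3, 6, 9, 12]          # CM, DSS3, DSS6, DSS9, DSS12
strength_28d = [39, 37, 32, 30, 23]     # approximate 28-day strengths in MPa
r = pearson(dss_percent, strength_28d)
print(f"Pearson r between DSS level and 28-day strength: {r:.3f}")
```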

3.6 Data harmonization and preprocessing

The dataset was compiled from both experimental results and supplementary literature sources, resulting in 185 samples with variables including cement, water, fine aggregate, coarse aggregate, DSS, admixture, age (days), and compressive strength. To ensure consistency, all the data were converted to standard units (kg/m³ for materials and MPa for strength). Outliers were identified via box plots and z scores; entries with physically implausible values or missing critical fields were removed. The features were then scaled via min–max normalization to facilitate model training and convergence, especially for models that are sensitive to feature magnitude, such as the MLP.
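
The two preprocessing steps named above, z-score screening and min–max scaling, can be sketched as follows. The function names, the z-score threshold, and the sample values are assumptions for illustration; the text does not state the cut-off used in the study.

```python
import statistics


def z_score_outliers(values, threshold):
    """Return the values whose absolute z score exceeds the threshold."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)   # population standard deviation
    return [v for v in values if abs(v - mean) / sd > threshold]


def min_max_scale(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical strength column (MPa) containing one implausible entry.
strengths = [30, 31, 29, 32, 30, 28, 31, 95]
# A threshold of 2.0 is an assumption suited to this very small sample.
flagged = z_score_outliers(strengths, threshold=2.0)
print("flagged as outliers:", flagged)

# Hypothetical cement contents (kg/m^3) scaled to the range 0 to 1.
scaled = min_max_scale([320, 350, 371, 400, 430])
print("scaled cement contents:", [round(s, 3) for s in scaled])
```

In a train-test setting, the minimum and maximum used for scaling would normally be taken from the training set only and then applied to the testing set.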

Fig. 5 Correlation heatmap

The box plot (Fig. 6) visualizes the distribution and variability of the concrete mix components, including cement (C), water (W), fine aggregate (FA), DSS, coarse aggregate (CA), age (A), and compressive strength (CS). Each box represents the interquartile range (IQR), with the central line indicating the median and the whiskers showing the spread of the data, while outliers are marked as individual points. The cement content ranges from approximately 320 to 430 kg/m³, with a fairly symmetrical distribution and no notable outliers, indicating consistent usage across mixes. The water content lies between 140 and 200 kg/m³, with slight skewness but a generally stable distribution. The fine aggregate shows greater variability, ranging from 560 to 750 kg/m³, whereas the DSS values range widely from approximately 10 to 150 kg/m³, indicating that it is used in varying quantities depending on the mix design. Coarse aggregate is the component used in the largest quantities, with values concentrated between 1150 and 1275 kg/m³, suggesting standardized application. The admixture content is low, but a few extreme outliers indicate high dosages in certain mixes. The compressive strength (CS) ranges from approximately 10 to 75 MPa, with several outliers indicating high-strength concrete mixtures. Water appears twice in the plot, suggesting a possible duplication or labelling error in the dataset that needs correction. The plot thus reveals the typical range of each ingredient and highlights which components are used consistently and which vary significantly, offering insights into the concrete mix design practices within the dataset.

Fig. 6 Box plots

3.7 Data preprocessing and validation strategy

To ensure consistency and comparability, all experimental and literature-based data were harmonized by conversion to standard units (kg/m³ for materials, MPa for strength). Data preprocessing involved outlier detection via box plots and z score analysis, and min–max normalization was applied to scale the features appropriately for ML model training, especially for the MLP. The dataset was randomly split into 70% for training and 30% for testing. Although no external datasets were available, model generalizability was evaluated via test set performance, Taylor diagrams, REC curves, and SHAP explanations. In future work, we intend to adopt k-fold cross-validation and external data sources for broader validation.

4 Results and discussion

4.1 Workability test

4.1.1 Slump test

Figure 7 compares the workability of concrete mixtures that contain dry sewage sludge (DSS) as a partial substitute for fine aggregates with that of a control mixture (CM) that does not contain any DSS. In the control mixture (CM), the slump value was approximately 95 mm. As the proportion of DSS in the concrete increased from 3 to 12%, a consistent reduction in slump was observed. Specifically, the slump decreased to approximately 85 mm at 3% DSS replacement (DSS3), followed by 73 mm at 6% (DSS6), 62 mm at 9% (DSS9), and finally 55 mm at 12% replacement (DSS12). This trend clearly demonstrates that increasing the DSS content results in a gradual decrease in the concrete workability. The decrease in slump with increasing DSS content can be attributed to the irregular shapes and rough textures of DSS particles, which increase internal friction among the aggregates. Among the DSS mixes evaluated, DSS3 (3% DSS replacement) showed the best workability, with a slump of approximately 85 mm. This level of slump still falls within an acceptable range. Therefore, 3% DSS replacement is identified as the most suitable percentage.

Fig. 7 Slumps of different mixtures (mix vs. slump in mm)

4.2 Mechanical properties

4.2.1 Compression strength

Figure 8 shows the results of compressive strength tests conducted on the concrete mixes. The control mixture (CM) exhibited the highest compressive strength at all ages, reaching approximately 29 MPa at 7 days, 36 MPa at 14 days, and 39 MPa at 28 days. When 3% DSS was introduced (DSS3), a slight reduction in strength was observed, but the performance of the concrete was still comparable to that of the control. Specifically, DSS3 reached approximately 28 MPa at 7 days, 34 MPa at 14 days, and 37 MPa at 28 days. As the DSS content increased beyond 3%, a more pronounced reduction in compressive strength was observed. At 6% replacement (DSS6), the strength values decreased further to approximately 22 MPa, 30 MPa, and 32 MPa at 7, 14, and 28 days, respectively. DSS9 had compressive strengths of approximately 20 MPa, 27 MPa, and 30 MPa, whereas DSS12 presented the lowest values of all the mixtures, reaching only 14 MPa, 19 MPa, and 23 MPa at 7, 14, and 28 days, respectively. Despite these drawbacks at higher dosages, the mixture with 3% DSS replacement (DSS3) appears to provide the most favourable balance between mechanical performance and sustainability.

Fig. 8 Compressive strength of different mixtures (mix vs. compressive strength in N/mm²)

4.2.2 Split tensile strength

Figure 9 shows the results of split tensile strength testing for the concrete mixtures studied in this work. The control mixture (CM) exhibited the highest tensile strength across all curing ages, reaching approximately 4.3 MPa, 5.2 MPa, and 6.6 MPa at 7, 14, and 28 days, respectively. When DSS was introduced at 3% replacement (DSS3), the tensile strength remained relatively close to the control values, at 4.0 MPa, 4.9 MPa, and 6.3 MPa at 7, 14, and 28 days, respectively. This marginal reduction indicates that the DSS3 mix is the most suitable alternative, maintaining an acceptable level of tensile performance while contributing to sustainability goals by recycling waste materials. At higher DSS replacement levels, the tensile strength decreased progressively. For example, DSS6 exhibited tensile strengths of 2.9 MPa, 4.0 MPa, and 5.0 MPa, whereas those of DSS9 decreased further to 2.3 MPa, 3.2 MPa, and 3.9 MPa. The DSS12 mixture exhibited the lowest tensile performance, with values of only 1.6 MPa, 2.1 MPa, and 2.7 MPa at 7, 14, and 28 days, respectively. The reduction in split tensile strength with increasing DSS content can be attributed to several factors. Owing to their porous structure, lower stiffness, and irregular geometry, DSS particles exhibit poor interfacial bonding with the surrounding cement paste. The DSS3 mix stands out as the most effective balance, showing only a minimal reduction in tensile strength relative to the control.

Fig. 9 Split tensile strength of different mixtures (mix vs. split tensile strength in N/mm²)

4.2.3 Flexural strength

Figure 10 shows the variation in the flexural strength of the concrete mixtures. The CM exhibited the highest flexural strength, with values of approximately 5.1 MPa, 5.9 MPa, and 7.8 MPa at 7, 14, and 28 days, respectively. The inclusion of 3% DSS (DSS3) resulted in only a slight reduction in flexural strength, with values of approximately 5.0 MPa, 6.3 MPa, and 7.2 MPa at the corresponding ages. This indicates that the DSS3 mix maintains a structural performance very close to that of the control. As the DSS content increased, the flexural strength decreased progressively. With 6% DSS (DSS6), the mixture reached 4.2 MPa, 5.1 MPa, and 6.4 MPa at 7, 14, and 28 days, respectively. The performance of the DSS9 mixture further decreased to 3.2, 4.3, and 5.6 MPa, whereas that of the DSS12 mixture was the lowest, at 2.2, 3.8, and 4.8 MPa over the same periods.

Fig. 10 Flexural strength of different mixtures (mix vs. flexural strength in N/mm²)

4.3 Machine learning

The descriptive statistics for the dataset (Table 6) offer insights into the distribution and variability of key variables affecting the compressive strength (CS) of concrete. The dataset includes 185 concrete mix records with various mechanical parameters. Cement (C) content averaged 371.71 kg/m³ with a low standard deviation (SD = 23.48), indicating consistency. Water (W) content averaged 160.21 kg/m³, ranging from 140 to 190 kg/m³, contributing to a water–cement ratio (WR) of 0.42 (SD = 0.048), a critical factor influencing strength and durability.

Fine aggregate (FA) averaged 675.54 kg/m³ with moderate variability, partly due to partial replacement with dry sewage sludge (DSS). DSS, the main variable in the study, had a mean of 47.26 kg/m³ (SD = 33.88), ranging from 0 to 143 kg/m³. High DSS levels tended to reduce strength due to increased porosity and hydration disruption, while low levels showed possible pozzolanic effects. Coarse aggregate (CA) was consistent (mean = 1183.07 kg/m³, SD = 51.18), while admixture (A) showed significant variability (mean = 21.85 kg/m³, SD = 19.60). The target output, CS, ranged from 10.60 to 70.93 MPa with a mean of 31.14 MPa. The dataset captures a broad strength spectrum, supporting robust analysis of DSS effects.

Table 6 Statistical analysis of the dataset

The comparison of regression models for predicting compressive strength (CS) (Table 7) highlights key performance differences. Among the linear models, ridge regression performs best, with an RMSE of 4.23, MSE of 17.90, MAE of 3.13, and R² of 0.88, indicating its effectiveness in addressing multicollinearity while maintaining predictive accuracy. Linear regression shows similar metrics, suggesting a relatively linear relationship between features and CS. In contrast, LASSO regression, which uses L1 regularization to remove less relevant features, yields weaker results (RMSE = 4.97, R² = 0.83), likely due to underfitting from excessive feature elimination. Among the nonlinear models, the decision tree regressor significantly improves performance, achieving an RMSE of 2.75, MAE of 1.96, and R² of 0.95, reflecting its strength in modeling complex, nonlinear relationships. The random forest regressor, an ensemble of decision trees, further enhances predictive power with the lowest RMSE (2.35), MSE (5.52), and MAE (1.75) and the highest R² (0.96), making it the most robust model in this study. The multilayer perceptron (MLP) also performs well (RMSE = 3.06, R² = 0.94), demonstrating its capability to learn nonlinear patterns, though it may require more computational effort and tuning.

Overall, the results show that while linear models offer a solid baseline, nonlinear models, particularly random forest and MLP, are better suited for predicting concrete compressive strength owing to their superior ability to capture complex interactions. Figure 11 illustrates the true vs. predicted CS values for the decision tree, random forest, and MLP regressors.

Table 7 Comparison of regression models

Fig. 11 True vs. predicted values for (a) decision tree regression, (b) random forest regression and (c) MLP regression
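For reference, the four metrics reported in Table 7 can be computed from paired true and predicted strengths as sketched below. The function name and the sample values are hypothetical and serve only to illustrate the standard definitions; they are not the study's data or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R² for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "R2": 1.0 - ss_res / ss_tot,
    }


# Hypothetical compressive strengths (MPa), for illustration only.
y_true = [20.0, 30.0, 40.0, 50.0]
y_pred = [22.0, 29.0, 37.0, 52.0]
metrics = regression_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in metrics.items()})
```

Because RMSE is the square root of MSE, the reported pairs can be cross-checked directly: 4.23² ≈ 17.89 for ridge regression and 2.35² ≈ 5.52 for the random forest, in agreement with the quoted MSE values.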

Figure 12 illustrates the relationship between the value of K in the K-nearest neighbors (KNN) algorithm and the corresponding mean error, which is typically derived from cross-validation. The x-axis represents the number of neighbors (K) considered by the algorithm, ranging from 1 to over 40, whereas the y-axis denotes the mean error associated with each K value. This type of plot is often used to determine the K value that minimizes prediction error and maximizes model performance. The plot shows that the mean error is lowest at K = 2, with a value slightly less than 0.90, indicating the most accurate prediction among all the tested K values. At K = 1, the mean error is slightly greater, and for K ≥ 3, the mean error increases sharply and stabilizes at 1.0, suggesting poor performance and misclassification or prediction failure. This behavior is unusual and signifies that the KNN model performs well only with very small K values and fails when more neighbors are included. This may be due to the nature of the data, such as high dimensionality, sparsity, or unnormalized feature scales, causing the model to lose discriminatory power when averaging over too many neighbors. The significance of this analysis lies in model tuning and generalization. Choosing an appropriate K value is critical for KNN, as a small K (such as 1 or 2) can lead to overfitting (capturing noise in the training data), whereas a large K can lead to underfitting (over-smoothing and losing important distinctions). In this case, the sharp rise in error beyond K = 2 indicates that only a very small neighborhood yields useful predictions for this dataset, and further investigation into the dataset's distribution or standardization might be warranted.

Fig. 12 K-nearest neighbor (KNN) algorithm
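The K-selection procedure discussed above, together with the suggested feature standardization, can be sketched as follows. This is an illustrative pure-Python version using leave-one-out cross-validation on synthetic data; the data-generating formula, the helper names, and the choice of leave-one-out are all assumptions and do not reproduce the study's setup. In practice a library such as scikit-learn would normally be used.

```python
import math
import random


def standardize(rows):
    """Scale each column to zero mean and unit variance (z-scores)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k nearest training points (Euclidean)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return sum(y for _, y in nearest) / k


def loo_mae(x, y, k):
    """Leave-one-out cross-validated mean absolute error for a given k."""
    total = 0.0
    for i in range(len(x)):
        train_x = x[:i] + x[i + 1:]
        train_y = y[:i] + y[i + 1:]
        total += abs(knn_predict(train_x, train_y, x[i], k) - y[i])
    return total / len(x)


# Synthetic mixes (cement in kg/m³, water-cement ratio); NOT the study's data.
random.seed(0)
features, strengths = [], []
for _ in range(60):
    cement = random.uniform(330.0, 410.0)
    wc = random.uniform(0.35, 0.55)
    features.append([cement, wc])
    strengths.append(80.0 - 100.0 * wc + 0.02 * cement + random.gauss(0.0, 1.5))

# Simplification: scaling uses all rows; a stricter setup rescales per fold.
scaled = standardize(features)
errors_by_k = {k: loo_mae(scaled, strengths, k) for k in range(1, 16)}
best_k = min(errors_by_k, key=errors_by_k.get)
print(best_k, round(errors_by_k[best_k], 2))
```

Without the standardization step, the cement column (hundreds of kg/m³) would dominate the distance computation over the water–cement ratio (tenths), which is one of the possible causes of degraded KNN performance noted above.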

Figure 13 displays the feature importance scores derived from a decision tree regressor, illustrating the relative influence of each input variable on the model's predictions. In decision trees, feature importance is calculated from the extent to which each feature reduces impurity (here, mean squared error) across all decision nodes. Higher scores indicate greater influence on the model's predictions. In this analysis, feature W has the highest importance score of 0.64, making it the most influential predictor of the target output, compressive strength. This suggests that water content plays a dominant role in the model's decision-making, which is consistent with established concrete science emphasizing water's critical role in strength and durability. Feature A (admixture) follows with a moderate importance score of 0.21, and FA (fine aggregate) contributes 0.10, indicating a secondary yet relevant influence. Other variables, such as a second instance of W, C (cement), and DSS (dry sewage sludge), have minimal importance (0.01–0.03), suggesting a limited effect on predictions. CA (coarse aggregate) has a score of 0.00, indicating that it was not used in any decision splits and thus has no impact in this model. This feature importance analysis helps prioritize key input variables for future data collection, feature selection, and material optimization. Features with negligible contributions can be excluded in subsequent models to simplify computation and enhance efficiency without sacrificing accuracy.

Fig. 13 Feature importance scores determined by a decision tree regressor
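The impurity-based importance described above can be illustrated on a tiny hand-built tree. In the sketch below, each split's weighted reduction in mean squared error is credited to the splitting feature, and the totals are normalized to sum to one. The four data points, thresholds, and helper names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def mse(values):
    """Mean squared deviation of the values from their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def impurity_decrease(feature, target, threshold, n_total):
    """Weighted MSE reduction achieved by splitting one node at a threshold."""
    left = [t for f, t in zip(feature, target) if f <= threshold]
    right = [t for f, t in zip(feature, target) if f > threshold]
    n = len(target)
    children = (len(left) * mse(left) + len(right) * mse(right)) / n
    # Weight by the share of all samples that reach this node.
    return (n / n_total) * (mse(target) - children)


# Hypothetical samples: water (kg/m³), admixture (kg/m³), strength (MPa).
water = [150, 150, 180, 180]
admix = [5, 20, 5, 20]
strength = [40.0, 44.0, 28.0, 30.0]
n_total = len(strength)

raw = {"W": 0.0, "A": 0.0}
# Root node: split all samples on water at 165.
raw["W"] += impurity_decrease(water, strength, 165, n_total)
# Left child (water <= 165) and right child (water > 165): split on admixture.
raw["A"] += impurity_decrease(admix[:2], strength[:2], 12, n_total)
raw["A"] += impurity_decrease(admix[2:], strength[2:], 12, n_total)

total = sum(raw.values())
importance = {name: value / total for name, value in raw.items()}
print({name: round(value, 3) for name, value in importance.items()})
```

A feature that is never chosen for a split accumulates no reduction and therefore receives a score of exactly zero, which is how the 0.00 score for CA arises.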

The Taylor diagram (Fig. 14) offers a comprehensive visual comparison of linear regression, LASSO regression, ridge regression, decision tree, and random forest models based on their statistical alignment with observed data. It combines three key metrics: standard deviation (radial distance), correlation coefficient (angular position), and centered root mean square difference (curved contours). The ideal model lies closest to the reference point on the x-axis, indicating high correlation and matching variability. In this study, linear and ridge regression models show the best performance, being nearest to the reference point with high correlation and low RMS error. LASSO, decision tree, and random forest models are positioned slightly farther, indicating lower alignment with the experimental data. This diagram is particularly valuable for evaluating predictive model performance in sustainable self-compacting concrete research, which involves complex variable interactions due to the use of biomaterials and recycled aggregates. By identifying models with the closest statistical match to observed results, the Taylor diagram aids in selecting the most reliable approach for accurate property prediction. This enhances confidence in the study’s simulations and design outcomes, thereby strengthening the validity and applicability of the research findings in sustainable construction practices.

Fig. 14 Taylor diagram
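The three quantities combined in a Taylor diagram are linked by a law-of-cosines identity, E′² = σ_p² + σ_o² − 2σ_pσ_oR, which is what allows them to share a single plot. The sketch below computes them for a hypothetical set of observed and predicted strengths and checks the identity numerically; the values and names are illustrative only.

```python
import math


def taylor_statistics(observed, predicted):
    """Standard deviations, correlation and centered RMS difference."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    dev_o = [o - mean_o for o in observed]
    dev_p = [p - mean_p for p in predicted]
    std_o = math.sqrt(sum(d * d for d in dev_o) / n)
    std_p = math.sqrt(sum(d * d for d in dev_p) / n)
    corr = sum(a * b for a, b in zip(dev_o, dev_p)) / (n * std_o * std_p)
    # Centered RMS difference: means are removed before differencing.
    crmsd = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_o, dev_p)) / n)
    return {"std_obs": std_o, "std_pred": std_p, "corr": corr, "crmsd": crmsd}


# Hypothetical strengths (MPa), for illustration only.
observed = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 29.0, 37.0, 52.0]
stats = taylor_statistics(observed, predicted)

# Law-of-cosines identity underlying the diagram's geometry.
identity_rhs = (stats["std_pred"] ** 2 + stats["std_obs"] ** 2
                - 2.0 * stats["std_pred"] * stats["std_obs"] * stats["corr"])
print(round(stats["crmsd"] ** 2, 6), round(identity_rhs, 6))
```

Note that the centered RMS difference ignores any constant bias between predictions and observations, so a model's position on the diagram need not rank identically to its RMSE.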

The regression error characteristic (REC) curve (Fig. 15) provides a comprehensive evaluation of the predictive performance of three regression models (random forest, decision tree, and multilayer perceptron (MLP)) on both the training and testing datasets. The x-axis denotes the error tolerance, while the y-axis represents cumulative accuracy, reflecting the proportion of predictions within a specific error range. Models with curves that rise steeply and achieve higher cumulative accuracy at lower error tolerances are considered more reliable. In this analysis, the random forest model demonstrates superior performance on training data (MAE = 0.08) and maintains good generalizability on test data (MAE = 0.26), making it the most balanced model. The decision tree model shows perfect accuracy on training data (MAE = 0.00) but suffers from overfitting, as evident from its higher test error (MAE = 0.31). The MLP regressor performs moderately well, with training and testing MAEs of 0.16 and 0.27, respectively, indicating better generalizability than the decision tree. The REC curve is valuable for visually comparing model accuracy over varying error tolerances and for understanding the trade-off between training performance and model generalization. In the context of predicting the properties of sustainable self-compacting concrete incorporating biomaterials and recycled asphalt pavement aggregates, such insights are critical. Accurate and generalizable predictions are essential for optimizing mix designs and ensuring consistent performance. The random forest model, with its robust predictive capability, emerges as the most suitable for applications in sustainable construction.

Fig. 15 Regression error characteristic (REC) curves
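An REC curve is constructed by counting, for each error tolerance, the fraction of predictions whose absolute error does not exceed that tolerance. A minimal sketch follows; the sample values, the tolerance grid, and the use of absolute (rather than squared) error are assumptions made for illustration.

```python
def rec_curve(y_true, y_pred, tolerances):
    """Cumulative accuracy: share of |error| <= tolerance, per tolerance."""
    abs_errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    n = len(abs_errors)
    return [sum(e <= tol for e in abs_errors) / n for tol in tolerances]


# Hypothetical strengths (MPa), for illustration only.
y_true = [20.0, 30.0, 40.0, 50.0]
y_pred = [22.0, 29.0, 37.0, 52.0]
tolerances = [0.0, 1.0, 2.0, 3.0]

accuracy = rec_curve(y_true, y_pred, tolerances)
for tol, acc in zip(tolerances, accuracy):
    print(f"tolerance {tol:.1f} MPa -> cumulative accuracy {acc:.2f}")
```

A model that memorizes its training set, as the decision tree does here, produces a training curve that jumps to 1.0 at zero tolerance, so the gap between its training and testing curves is a direct visual indicator of overfitting.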

The Shapley additive explanations (SHAP) plot (Fig. 16) offers a detailed interpretation of how each input feature affects the output predictions of the machine learning model used to evaluate self-compacting concrete performance. The plot ranks variables such as Days, Fine Aggregate, Water-Cement ratio, Cement, Coarse Aggregate, Sludge, and Water based on their contribution to the model's output. The x-axis shows SHAP values, indicating whether a feature increases (positive) or decreases (negative) the predicted strength. A colour gradient represents feature magnitude: blue for low values and red for high. Among the features, curing time (Days) has the most substantial influence; longer curing durations (shown in red) significantly enhance predicted concrete strength. Cement and coarse aggregate also play key roles, with higher contents boosting strength predictions, consistent with their known contributions to strength development. In contrast, a high water–cement ratio negatively impacts the output, reflecting the weakening effect of excess water. Sludge, water, and fine aggregate exhibit moderate influences, indicating their relevance but lesser dominance. This SHAP analysis is essential for bringing transparency to complex model predictions. It reveals not only which features matter most but also how their variation affects outcomes. This interpretability is especially critical when dealing with unconventional materials like sludge and recycled aggregates, ensuring alignment with established material science principles. In this study, SHAP enhances confidence in the model's validity and supports informed mix design optimization. Ultimately, it confirms the potential for integrating biomaterials and recycled components in sustainable concrete, aiding the development of eco-friendly, high-performance construction materials.

Fig. 16 SHAP (Shapley additive explanations) summary plot
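The quantity plotted on the x-axis of a SHAP summary plot is a Shapley value: a feature's average marginal contribution to the prediction over all subsets of the other features. For a handful of features it can be computed exactly by enumeration, as sketched below. The toy strength model, its coefficients, and the single-baseline treatment of "absent" features are simplifying assumptions for illustration; SHAP libraries typically average over a background dataset and use faster model-specific algorithms.

```python
import math
from itertools import combinations


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample; absent features use the baseline."""
    n = len(x)

    def value(present):
        mixed = [x[i] if i in present else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = (math.factorial(size) * math.factorial(n - size - 1)
                      / math.factorial(n))
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_strength(z):
    """Hypothetical strength model (MPa); coefficients are made up."""
    cement, wc_ratio, days = z
    return 0.05 * cement - 40.0 * wc_ratio + 5.0 * math.log(days)


names = ["Cement", "W/C ratio", "Days"]
sample = [400.0, 0.40, 28.0]     # mix being explained
baseline = [370.0, 0.42, 7.0]    # reference mix standing in for "average"
phi = shapley_values(toy_strength, sample, baseline)
for name, contribution in zip(names, phi):
    print(f"{name}: {contribution:+.2f} MPa")
```

The contributions always sum to the difference between the prediction for the sample and the prediction for the baseline, which is the additivity property that makes SHAP values directly comparable across features.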

5 Conclusion

This study scientifically evaluated the use of dry sewage sludge (DSS) as a sustainable fine aggregate replacement in concrete and assessed its mechanical performance via both experimental methods and machine learning (ML) models.

  • The experimental results clearly indicated that replacing 3% of fine aggregates with dry sewage sludge (DSS) offers a practical trade-off between sustainability and mechanical performance.

  • At this dosage, the concrete exhibited only minor reductions in workability and strength, with the slump decreasing from 95 mm to 85 mm and the compressive strength dropping slightly from 39 MPa to 37 MPa. Similar marginal decreases were also observed in the split tensile and flexural strengths at the 3% DSS level.

  • These reductions are attributed primarily to the higher porosity and water absorption capacity of the DSS, which affects the water-to-cement ratio.

  • At higher replacement levels (≥ 6%), the decline in strength became more pronounced due to poor particle packing and increased internal friction.

  • SEM and EDS analyses confirmed the presence of silicates and aluminates in the DSS, which may contribute to increased pozzolanic activity and strength at later ages when used at lower replacement levels.

  • The application of machine learning further reinforced the experimental findings, particularly through the random forest regressor, which achieved an R² value of 0.96 and an RMSE of 2.35.

  • SHAP analysis revealed that the curing age was the most influential factor in predicting the compressive strength, followed by the cement dosage and the W/C ratio.

  • The DSS content was found to have a moderate influence on strength prediction, supporting its role as a viable secondary component in sustainable concrete mix design when used in controlled quantities.

In a broader context, this work bridges experimental materials science with data-driven modelling, highlighting that intelligent mix design informed by ML can accelerate the adoption of sustainable alternatives such as DSS. It also contributes to the growing body of research that promotes circular economy strategies in construction by valorising waste from urban wastewater treatment plants.

6 Limitations and future work

Although dry sewage sludge (DSS) offers potential as a sustainable partial replacement for fine aggregates in concrete, its use is limited by concerns over long-term durability, chemical resistance, and environmental performance. DSS properties vary significantly with source, treatment process, and seasonal factors. Owing to its porous structure and high variability, replacement levels beyond 3% often lead to reduced workability and strength unless the mix is modified with admixtures or supplementary materials. Unlike previous studies such as Singh et al. [23], which supported up to 10% DSS replacement without predictive tools, this study employs machine learning (ML) models (random forest, decision tree, and multilayer perceptron) with SHAP explanations to optimize mix design and improve predictive accuracy. Few prior studies have applied such explainable ML methods to DSS-based concrete. The integration of SHAP provides transparent insights into feature importance for compressive strength prediction. This computational approach enhances the understanding of DSS behavior and assists in performance optimization. Further research should investigate the long-term durability and microstructure of DSS-modified concrete. Exploring the use of chemical admixtures or fiber reinforcements could enhance mix performance at higher DSS levels. Field trials are also recommended to assess real-world behavior under varied environmental and loading conditions.