Exploratory Modeling and Analysis (EMA) generates large datasets of scenarios, policies, and outcomes that capture complex system behaviors and uncertainties. Machine learning could be well-suited for certain types of results analysis, because it excels at finding patterns in high-dimensional data, quantifying uncertainties, and creating fast approximations of complex relationships.
Here are some ideas that might be interesting to explore:
**1. Deep Learning Surrogate Models:** Enable users to train neural networks that approximate computationally expensive models. This would allow rapid exploration of the scenario/policy space and support additional analyses such as gradient-based sensitivity studies. The API would be straightforward: `surrogate = ml.create_surrogate('outcome_name')` followed by `predictions = surrogate.predict(new_scenarios)`.
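As a rough sketch of what `create_surrogate` could do internally (the `ml.*` API above is only a proposal): fit a scikit-learn neural network on the experiments/outcomes pair. Everything below is synthetic stand-in data, not actual workbench output:

```python
# Hypothetical sketch: train an MLP surrogate on data shaped like the
# experiments DataFrame and outcomes dict from perform_experiments().
import numpy as np
import pandas as pd
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(42)
experiments = pd.DataFrame(rng.uniform(0, 1, size=(500, 3)),
                           columns=["x1", "x2", "x3"])
outcomes = {"outcome_name": (2 * experiments["x1"]
                             + np.sin(np.pi * experiments["x2"])).to_numpy()}

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000,
                         random_state=0)
surrogate.fit(experiments.to_numpy(), outcomes["outcome_name"])

new_scenarios = rng.uniform(0, 1, size=(10, 3))
predictions = surrogate.predict(new_scenarios)

# A high R^2 suggests the surrogate captures the underlying model well.
r2 = surrogate.score(experiments.to_numpy(), outcomes["outcome_name"])
```

Reporting the R² alongside the trained surrogate would directly support judging how well the surrogate fits.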
**2. Automated Scenario Clustering:** Help users identify groups of similar scenarios or policies using modern clustering techniques. The API would automatically determine the optimal cluster count and provide both hierarchical and density-based clustering options. Users could simply call `clusters = ml.cluster_experiments()` to get meaningful groupings of their results.
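A minimal illustration of automatic cluster-count selection, here with k-means and the silhouette score on synthetic blobs (a real implementation would also offer the hierarchical and density-based options mentioned above):

```python
# Illustrative: pick the cluster count k that maximizes the silhouette score.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# three well-separated blobs standing in for experiment inputs
X = np.vstack([rng.normal(loc=center, scale=0.1, size=(50, 2))
               for center in [(0, 0), (3, 3), (0, 3)]])

best_k, best_score = None, -1.0
for k in range(2, 8):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    score = silhouette_score(X, labels)
    if score > best_score:
        best_k, best_score = k, score

clusters = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
```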
**3. Interpretable Feature Importance:** Use SHAP (SHapley Additive exPlanations) values to provide detailed insights into how different uncertainties and levers influence outcomes. This would extend beyond basic sensitivity analysis by capturing non-linear interactions. The API would be `importance = ml.analyze_importance('outcome_name')`.
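SHAP itself needs the optional `shap` package, so as a stand-in the sketch below uses scikit-learn's permutation importance on synthetic data; SHAP values would go further by attributing non-linear interactions per individual experiment:

```python
# Illustrative: rank uncertainties/levers by permutation importance.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
X = pd.DataFrame(rng.uniform(-1, 1, size=(400, 3)),
                 columns=["uncertainty_a", "uncertainty_b", "lever_c"])
y = 3 * X["uncertainty_a"] + X["uncertainty_b"] ** 2  # lever_c has no effect

model = GradientBoostingRegressor(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
importance = pd.Series(result.importances_mean, index=X.columns)
```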
**4. Anomaly Detection:** Automatically identify unusual or extreme scenarios that might deserve special attention. This could help focus computational resources and highlight potential edge cases. Usage would be as simple as `anomalies = ml.detect_anomalies()`.
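A sketch of how `detect_anomalies` might work under the hood using scikit-learn's `IsolationForest` (the data and contamination rate are illustrative):

```python
# Illustrative: flag extreme experiments with an isolation forest.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(2)
X = rng.normal(0, 1, size=(300, 4))
X[:5] += 8  # inject five extreme scenarios

detector = IsolationForest(contamination=0.02, random_state=0).fit(X)
flags = detector.predict(X)            # -1 marks anomalies
anomalies = np.where(flags == -1)[0]   # indices of flagged experiments
```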
**5. Uncertainty Quantification:** Use Gaussian process regression to predict outcomes for new scenarios while quantifying the uncertainty in those predictions. This would help users understand both expected outcomes and their confidence bounds. The API would be `mean, std = ml.predict_with_uncertainty(new_scenarios)`.
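A minimal sketch with scikit-learn's `GaussianProcessRegressor`, whose `return_std=True` option yields exactly the `mean, std` pair proposed above (toy 1-D data):

```python
# Illustrative: GP regression gives a mean prediction plus its std dev.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(3)
X = rng.uniform(0, 10, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.05, 60)

kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-2)
gp = GaussianProcessRegressor(kernel=kernel, random_state=0).fit(X, y)

new_scenarios = np.linspace(0, 10, 20).reshape(-1, 1)
mean, std = gp.predict(new_scenarios, return_std=True)
```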
**6. Policy Sensitivity Analysis:** Leverage neural network gradients to analyze how sensitive policies are to different uncertainties. This would provide detailed insights into policy robustness. Users could call `sensitivity = ml.analyze_policy_sensitivity('policy_id')`.
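scikit-learn does not expose input gradients of a fitted `MLPRegressor`, so this sketch approximates them with central finite differences; exact gradients would need an autodiff framework such as PyTorch. All names and data here are illustrative:

```python
# Illustrative: approximate input sensitivities of a fitted NN surrogate
# with central finite differences.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(500, 3))
y = 2 * X[:, 0] - 0.5 * X[:, 1]  # the third input has no effect

model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=3000,
                     random_state=0).fit(X, y)

def finite_diff_sensitivity(model, x, eps=1e-3):
    """Central-difference gradient of the model output w.r.t. each input."""
    grads = np.empty(len(x))
    for i in range(len(x)):
        hi, lo = x.copy(), x.copy()
        hi[i] += eps
        lo[i] -= eps
        grads[i] = (model.predict(hi[None, :])[0]
                    - model.predict(lo[None, :])[0]) / (2 * eps)
    return grads

sensitivity = finite_diff_sensitivity(model, np.zeros(3))
```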
**7. Automated ML Optimization:** Use modern AutoML techniques to automatically find the best ML models and hyperparameters for analyzing particular outcomes. This would make advanced ML accessible to all users through a simple `best_model = ml.auto_optimize('outcome_name')` interface.
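A lightweight stand-in for full AutoML: cross-validated search over a small dictionary of candidate model families with `GridSearchCV` (the candidates and grids here are illustrative, and a real implementation might use a dedicated AutoML library instead):

```python
# Illustrative: pick the best model family + hyperparameters by CV score.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(5)
X = rng.uniform(-1, 1, size=(300, 4))
y = X[:, 0] ** 2 + X[:, 1]  # partly non-linear outcome

candidates = {
    "ridge": GridSearchCV(Ridge(), {"alpha": [0.1, 1.0, 10.0]}, cv=5),
    "forest": GridSearchCV(RandomForestRegressor(random_state=0),
                           {"n_estimators": [50, 100]}, cv=5),
}
scores = {name: search.fit(X, y).best_score_
          for name, search in candidates.items()}
best_name = max(scores, key=scores.get)
best_model = candidates[best_name].best_estimator_
```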
Most of this only needs scikit-learn. All capabilities would work directly with the standard `experiments` DataFrame and `outcomes` dictionary returned by `perform_experiments()`.
@quaquel Curious what you find useful. I think 1 is very interesting, especially determining how well a surrogate model would fit. If the fit is very high (say 99.9%), it probably means an ML model can describe the system very well, while lower scores mean the system is either more (stochastically) noisy or more complex than an ML model can easily fit.
2, 3 and 4 seem like useful analysis functions to have. 5 and 7 are somewhat extensions of 1, and 6 seems to be highly complex.
If you could pick one of these, what would you find most useful?