I have the following problem:
Given inputs $x$ ($n$-dimensional vector) of scalars, ordered integers and
unordered integers (i.e., labels) and one or several outputs $y$, I would
like to estimate:
- Which inputs best explain the outputs.
- To what extent variations of one input imply variations of the outputs.
This is supposed to be related to uncertainty and sensitivity analysis, which are quite broad fields. Do you know of any methods/resources with an approach related to my problem?
You can try one of the tools provided here. They are MATLAB solutions, with clean code and modern methods. I would suggest starting with the graphical tools from the library to get a feel for the data.
Since you did not give details on what you need, here are some comments on the methods involved:
Global Sensitivity Analysis.
Global sensitivity analysis is the study of how the uncertainty in the output of a model (numerical or otherwise) can be apportioned to different sources of uncertainty in the model input. "Global" would be an unnecessary specification here, were it not for the fact that most analyses found in the literature are local or one-factor-at-a-time.
Monte Carlo (or Sample-based) Analysis. Monte Carlo (MC) analysis is based on performing multiple evaluations with randomly selected model inputs, and then using the results of these evaluations to determine both the uncertainty in model predictions and the contribution of each input factor to this uncertainty. An MC analysis involves the selection of ranges and distributions for each input factor; generation of a sample from the ranges and distributions specified in the first step; evaluation of the model for each element of the sample; uncertainty analysis; and sensitivity analysis.
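To make the steps concrete, here is a minimal sketch in plain Python. The toy `model`, the uniform input distributions, and the use of squared correlation coefficients as the sensitivity measure are all assumptions for illustration; squared correlations only behave like variance shares when the model is close to linear and the inputs are independent.

```python
import random
import statistics

def model(x1, x2, x3):
    # Hypothetical toy model: x1 dominates, x2 matters a little, x3 not at all.
    return 4.0 * x1 + x2 + 0.0 * x3

def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

rng = random.Random(0)
n_samples = 5000

# Steps 1-2: each factor is assumed uniform on [0, 1]; draw the sample.
sample = [[rng.random() for _ in range(3)] for _ in range(n_samples)]

# Step 3: evaluate the model on every element of the sample.
y = [model(*row) for row in sample]

# Step 4: uncertainty analysis (summary of the output distribution).
y_mean = statistics.fmean(y)
y_std = statistics.stdev(y)

# Step 5: sensitivity analysis via the squared correlation of each factor with y.
r2 = [pearson([row[j] for row in sample], y) ** 2 for j in range(3)]

print(f"mean={y_mean:.3f} std={y_std:.3f}")
print("squared correlations:", [round(v, 3) for v in r2])
```

For a strongly nonlinear model you would want a rank-based or variance-based measure instead of plain correlation.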
Response Surface Methodology. This procedure is based on the development of a response surface approximation to the model under consideration. This approximation is then used as a surrogate for the original model in uncertainty and sensitivity analysis.
The analysis involves the selection of ranges and distributions for each input factor, the development of an experimental design defining the combinations of factor values at which to evaluate the model, evaluation of the model at those points, construction of a response surface approximation to the original model, uncertainty analysis, and sensitivity analysis.
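As a rough sketch of the idea, the snippet below fits a quadratic response surface to a handful of model runs by least squares and then runs the Monte Carlo step on the cheap surrogate instead of the model. The one-factor `expensive_model`, the five-point design, and the quadratic basis are illustrative assumptions, not something prescribed by the method.

```python
import math
import random
import statistics

def expensive_model(x):
    # Hypothetical stand-in for a costly simulation with one input factor.
    return math.exp(x)

def basis(x):
    # Quadratic response surface: 1, x, x^2.
    return [1.0, x, x * x]

def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

# Experimental design: five equally spaced runs on the assumed range [-1, 1].
design = [-1.0, -0.5, 0.0, 0.5, 1.0]
responses = [expensive_model(x) for x in design]

# Least-squares fit via the normal equations (A^T A) coef = A^T y.
rows = [basis(x) for x in design]
ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
aty = [sum(r[i] * y for r, y in zip(rows, responses)) for i in range(3)]
coef = solve(ata, aty)

def surrogate(x):
    return sum(c * phi for c, phi in zip(coef, basis(x)))

# Uncertainty analysis on the surrogate: the factor is assumed uniform on [-1, 1].
rng = random.Random(1)
ys = [surrogate(rng.uniform(-1.0, 1.0)) for _ in range(10000)]
surrogate_mean = statistics.fmean(ys)

# Check how far the surrogate is from the model on a fine grid.
grid = [i / 50 - 1.0 for i in range(101)]
max_err = max(abs(surrogate(x) - expensive_model(x)) for x in grid)
print(f"surrogate mean={surrogate_mean:.3f}, max abs error={max_err:.3f}")
```

The surrogate is only as good as the fit, so it is worth checking its error on points that were not part of the design before trusting any sensitivity results derived from it.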
Screening Designs. Factor screening may be useful as a first step when dealing with a model containing a large number of input factors (hundreds). By input factor we mean any quantity that can be changed in the model prior to its execution. This can be a model parameter, an input variable, or a model scenario. Often, only a few of the input factors and groupings of factors have a significant effect on the model output.
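One very simple way to screen, sketched below, is a one-at-a-time design: move each factor from its low to its high value while the others stay at a baseline, and rank the factors by the size of the change in the output. This particular design, the six-factor `model`, and the ranges are assumptions for illustration; the text above does not prescribe a specific screening design, and this one cannot see interactions between factors.

```python
def model(x):
    # Hypothetical model with six factors; only x[0] and x[3] matter much.
    return (5.0 * x[0] + 0.01 * x[1] + 0.02 * x[2]
            + 3.0 * x[3] ** 2 + 0.0 * x[4] + 0.01 * x[5])

n_factors = 6
low = [0.0] * n_factors        # assumed lower bound of each factor
high = [1.0] * n_factors       # assumed upper bound of each factor
baseline = [0.5] * n_factors   # all other factors are held here

effects = []
for i in range(n_factors):
    lo_point = list(baseline)
    hi_point = list(baseline)
    lo_point[i] = low[i]
    hi_point[i] = high[i]
    # Absolute change in the output when only factor i moves across its range.
    effects.append(abs(model(hi_point) - model(lo_point)))

# Factors sorted from most to least influential under this design.
ranking = sorted(range(n_factors), key=lambda i: effects[i], reverse=True)
print("effects:", [round(e, 3) for e in effects])
print("ranking:", ranking)
```

This costs two model runs per factor, which is why screening is attractive when there are hundreds of factors and each run is expensive.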
Local (Differential) Analysis. Local SA investigates the impact of the input factors on the model locally, i.e. at some fixed point in the space of the input factors. Local SA is usually carried out by computing partial derivatives of the output functions with respect to the input variables (differential analysis). To compute the derivatives numerically, the input parameters are varied within a small interval around a nominal value. The interval is not related to our degree of knowledge of the variables and is usually the same for all of the variables.
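A minimal sketch of the numerical derivative step, using central finite differences with the same step for every variable. The three-factor `model`, the nominal point, and the step size `h` are assumptions chosen for illustration.

```python
def model(x):
    # Hypothetical model with three input factors.
    return x[0] ** 2 + 3.0 * x[1] + x[0] * x[2]

def local_sensitivities(f, x0, h=1e-4):
    """Central-difference estimate of each partial derivative of f at x0."""
    grads = []
    for i in range(len(x0)):
        up = list(x0)
        down = list(x0)
        up[i] += h      # perturb only factor i upward
        down[i] -= h    # and downward by the same amount
        grads.append((f(up) - f(down)) / (2.0 * h))
    return grads

nominal = [1.0, 2.0, 0.5]
grads = local_sensitivities(model, nominal)
print("partial derivatives at the nominal point:", [round(g, 4) for g in grads])
```

These numbers only describe the model near the nominal point; at a different point the ordering of the factors can change, which is the main limitation compared with the global methods.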
FORM-SORM. FORM and SORM are useful methods when the analyst is not interested in the magnitude of Y (and hence its potential variation) but in the probability of Y exceeding some critical value. The constraint (Y-Ycrit < 0) determines a hyper-surface in the space of the input factors, X. The minimum distance between some design point for X and the hyper-surface is the quantity of interest.
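To illustrate the distance idea, here is a sketch for the one case where FORM is exact: a linear model with independent normal inputs. After standardising the inputs, the surface $Y = Y_{crit}$ is a hyperplane, its distance from the origin is the reliability index `beta`, and the exceedance probability is $\Phi(-\beta)$. The coefficients, means, standard deviations and `y_crit` are made-up values, and a crude Monte Carlo run is included only as a sanity check. For a nonlinear model, FORM would linearise the surface at the closest point, so the result would be an approximation.

```python
import math
import random

# Hypothetical linear model Y = a0 + sum(a[i] * X[i]) with independent normal X[i].
a0 = 1.0
a = [2.0, -1.0]
mu = [0.0, 1.0]       # assumed means of the input factors
sigma = [1.0, 2.0]    # assumed standard deviations of the input factors
y_crit = 6.0          # critical value of the output

# Mean and standard deviation of Y implied by the linear model.
mu_y = a0 + sum(ai * mi for ai, mi in zip(a, mu))
sigma_y = math.sqrt(sum((ai * si) ** 2 for ai, si in zip(a, sigma)))

# Distance from the origin to the hyperplane Y = y_crit in standardised space.
beta = (y_crit - mu_y) / sigma_y

# P(Y > y_crit) = Phi(-beta), written with the complementary error function.
p_form = 0.5 * math.erfc(beta / math.sqrt(2.0))

# Sanity check by plain Monte Carlo sampling.
rng = random.Random(2)
n = 100000
exceed = 0
for _ in range(n):
    y = a0 + sum(ai * rng.gauss(mi, si) for ai, mi, si in zip(a, mu, sigma))
    if y > y_crit:
        exceed += 1
p_mc = exceed / n

print(f"beta={beta:.3f} p_form={p_form:.5f} p_mc={p_mc:.5f}")
```

The attraction of the distance-based estimate is that it needs no sampling at all, which matters when the probability of interest is very small and plain Monte Carlo would need a huge number of runs.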