Neural Network Tool
One Tool Example
The Neural Network tool has a One Tool Example. Visit Sample Workflows to learn how to access this and many other examples directly in Alteryx Designer.
The Neural Network tool creates a feedforward perceptron neural network model with a single hidden layer. The neurons in the hidden layer use a sigmoid (logistic) activation function, and the output activation function depends on the nature of the target field. Specifically, for binary classification problems (e.g., the probability that a customer buys or does not buy), the output activation function is logistic; for multinomial classification problems (e.g., the probability that a customer chooses option A, B, or C), it is softmax; and for regression problems (where the target is a continuous, numeric field), it is linear.
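The three output activation functions named above can be sketched in a few lines of Python. This is only an illustration of the math, not the tool's implementation; the function names are hypothetical.

```python
import math

def logistic(z):
    """Logistic (sigmoid) output, used for binary classification."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Softmax output, used for multinomial classification."""
    m = max(zs)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def linear(z):
    """Linear (identity) output, used for regression."""
    return z

# Probability of "buys" from a single output score.
p_buy = logistic(0.0)               # 0.5
# Probabilities for options A, B, and C; they sum to one.
p_abc = softmax([1.0, 2.0, 3.0])
# Regression output is passed through unchanged.
y_hat = linear(3.2)
```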
Neural networks were among the first machine learning algorithms (as opposed to traditional statistical approaches) used for predictive modeling. The motivation behind the method is mimicking the structure of neurons in the brain (hence the method's name). The basic structure of a neural network involves a set of inputs (predictor fields) that feed into one or more "hidden" layers, with each hidden layer having one or more "nodes" (also known as "neurons").
In the first hidden layer, the inputs are linearly combined (with a weight assigned to each input in each node), and an "activation function" is applied to the weighted linear combination of the predictors. In the second and subsequent hidden layers, the outputs from the nodes of the prior hidden layer are linearly combined in each node of the hidden layer (again with weights assigned to each node from the prior hidden layer), and an activation function is applied to the weighted linear combination. Finally, the results from the nodes of the final hidden layer are combined in a final output layer that uses an activation function consistent with the target type.
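As a minimal sketch of the structure described above, the following Python function computes a forward pass through one hidden layer with a linear (regression) output. The bias terms, the function names, and all numeric values are assumptions made for illustration; the page itself only describes weighted linear combinations.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: inputs -> one sigmoid hidden layer -> linear output."""
    # Each hidden node applies the activation to its own weighted
    # linear combination of the inputs (plus an assumed bias term).
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
        for ws, b in zip(hidden_weights, hidden_biases)
    ]
    # The output layer linearly combines the hidden node values.
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias

# Two predictors, two hidden nodes, one output (made-up weights).
prediction = forward(
    x=[0.5, -1.0],
    hidden_weights=[[0.1, 0.2], [-0.3, 0.4]],
    hidden_biases=[0.0, 0.1],
    output_weights=[1.0, -1.0],
    output_bias=0.5,
)
```

For a classification target, the final line of `forward` would instead pass the combination through a logistic or softmax function.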
Estimation (or "learning" in the vocabulary of the neural network literature) involves finding the set of weights, for each input or prior-layer node value, that minimizes the model's objective function. For a continuous numeric target, this means minimizing the sum of squared errors between the model's predictions and the actual values, while classification networks minimize an entropy measure for both binary and multinomial classification problems. As indicated above, the Neural Network tool (which relies on the R nnet package) allows only a single hidden layer (which can have an arbitrary number of nodes) and always uses a logistic transfer function in the hidden layer nodes. Despite these limitations, our research indicates that the nnet package is the most robust neural network package available in R at this time.
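The two kinds of objective function mentioned above can be sketched as follows. The function names are hypothetical, and the entropy measure is shown in its usual binary cross-entropy form, which is an assumption about the exact formula used.

```python
import math

def sum_squared_errors(actual, predicted):
    """Regression objective: sum of squared prediction errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def binary_cross_entropy(actual, predicted, eps=1e-12):
    """Binary classification objective; `actual` holds 0/1 labels and
    `predicted` holds probabilities of the positive class."""
    total = 0.0
    for a, p in zip(actual, predicted):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total -= a * math.log(p) + (1 - a) * math.log(1.0 - p)
    return total

sse = sum_squared_errors([1.0, 2.0], [1.0, 4.0])   # 4.0
bce = binary_cross_entropy([1, 0], [0.9, 0.2])
```

Estimation searches for the weights that make the relevant quantity as small as possible.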
While more modern statistical learning methods (such as the models produced by the Boosted, Forest, and Spline Model tools) typically provide greater predictive efficacy than neural network models, in some specific applications (which cannot be determined before the fact) neural network models outperform other methods for both classification and regression. Moreover, in some areas, such as financial risk assessment, neural network models are considered a "standard" method that is widely accepted.
This tool uses the R tool. Go to Options > Download Predictive Tools and sign in to the Alteryx Downloads and Licenses portal to install R and the packages used by the R tool. Visit Download and Use Predictive Tools.
Configure the Tool
Model name: Each model must be given a name so that it can be identified later. Model names must start with a letter and may contain letters, numbers, and the special characters period (".") and underscore ("_"). No other special characters are allowed. Also, R is case sensitive.
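The naming rule can be checked before running a workflow with a small helper such as the one below. This is a sketch only: the helper name is hypothetical, and it assumes "letters" means the ASCII letters A to Z and a to z.

```python
import re

# Starts with a letter, then any mix of letters, digits, "." and "_".
MODEL_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9._]*")

def is_valid_model_name(name):
    """Return True if `name` follows the model naming rule."""
    return MODEL_NAME_PATTERN.fullmatch(name) is not None

is_valid_model_name("Churn_Model.v2")   # True
is_valid_model_name("2nd_model")        # False: starts with a digit
is_valid_model_name("my model")         # False: contains a space
```

Because R is case sensitive, `Churn_Model` and `churn_model` would count as two different names.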
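Select the target variable: Select the field from the data stream that you want to predict. As described above, the target can be a categorical (for example, string) field for classification problems or a continuous numeric field for regression problems.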
Select the predictor fields: Select the fields from the data stream that you believe "cause" changes in the value of the target variable. Columns that contain unique identifiers, such as surrogate primary keys and natural primary keys, should not be used in statistical analyses. They have no predictive value and can cause runtime exceptions.
Use sampling weights in model estimation: Select the check box, then select a weight field from the data stream to estimate a model that uses sampling weights.
The number of nodes in the hidden layer: The number of nodes (neurons) in the model's single hidden layer. The default is ten.
Include marginal effect plots?: An option to include plots in the report that show the relationship between a predictor variable and the target, averaging over the effect of the other predictor fields. The number of plots produced is controlled by "The minimal level of importance of a field to be included in the plots," which sets the percentage of the model's total predictive power that a field must contribute in order to have a marginal effect plot produced for it. A higher value reduces the number of marginal effect plots produced.
Custom scaling/normalization...: The numeric methods underlying the optimization of the model's weights can be problematic if the inputs (predictor fields) are on different scales (e.g., income, which ranges from seven thousand to one million, combined with the number of members present in the household, which ranges from one to seven). The following scaling options are available:
None (default)
Z-score: All predictor fields are scaled so that they have a mean of zero and a standard deviation of one.
Unit interval: All predictor fields are scaled so that they have a minimum value of zero and a maximum value of one, with all other values being between zero and one.
Zero centered: All predictor fields are scaled so that they have a minimum value of negative one and a maximum value of one, with all other values being between negative one and positive one.
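The three scaling options can be sketched in Python as shown below. The function names are hypothetical, and the use of the sample standard deviation for the Z-score is an assumption; the page does not say which variant the tool uses.

```python
import statistics

def z_score(values):
    """Scale to mean zero and standard deviation one."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed)
    return [(v - mean) / sd for v in values]

def unit_interval(values):
    """Scale to a minimum of zero and a maximum of one."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def zero_centered(values):
    """Scale to a minimum of negative one and a maximum of one."""
    return [2.0 * u - 1.0 for u in unit_interval(values)]

household_members = [1, 4, 7]
unit_interval(household_members)   # [0.0, 0.5, 1.0]
zero_centered(household_members)   # [-1.0, 0.0, 1.0]
z_score(household_members)         # [-1.0, 0.0, 1.0]
```

After any of these transformations, a field such as income and a field such as household size end up on comparable scales. Note that this sketch divides by zero for a field whose values are all identical.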
The weight decay: The weight decay limits the movement of the weight values at each iteration (also called an "epoch") of the estimation process. Its value should be between zero and one; larger values place a greater restriction on the possible movements of the weights. A weight decay value between 0.01 and 0.2 often works well.
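One common way to express weight decay, assumed here and not confirmed by this page, is as a penalty added to the objective function that grows with the sum of the squared weights. The sketch below uses a hypothetical function name to show how a larger decay value makes large weights more costly.

```python
def penalized_objective(fit_error, weights, decay):
    """Fit error plus a weight decay penalty (assumed L2 form)."""
    return fit_error + decay * sum(w * w for w in weights)

weights = [1.0, 2.0]
light = penalized_objective(1.0, weights, decay=0.01)
heavy = penalized_objective(1.0, weights, decay=0.2)
# `heavy` is larger than `light`: the same weights are penalized more,
# which pushes the estimation toward smaller weights.
```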
The +/- range of the initial (random) weights around zero: The weights given to the input variables in each hidden node are initialized using random numbers. This option sets the range of the random numbers used. Generally, the value should be near 0.5. However, smaller values can be better if all the input variables are large in magnitude. A value of 0 is a special value that causes the tool to find a good compromise value given the input data.
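A minimal sketch of this initialization, assuming the weights are drawn uniformly from the interval between minus the range and plus the range, is shown below. The function name and the `seed` parameter are hypothetical, and the special handling of a value of 0 is not modeled.

```python
import random

def initial_weights(n_weights, rang, seed=None):
    """Draw starting weights uniformly from [-rang, +rang]."""
    rng = random.Random(seed)
    return [rng.uniform(-rang, rang) for _ in range(n_weights)]

# Ten starting weights in the range -0.5 to +0.5.
start = initial_weights(10, rang=0.5, seed=42)
```

Because the starting weights are random, two runs without a fixed seed can converge to different final models.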
The maximum number of weights allowed in the model: This option becomes relevant when there are a large number of predictor fields and nodes in the hidden layer. Reducing the number of weights speeds up model estimation, and also reduces the chance that the algorithm finds a local optimum (as opposed to a global optimum) for the weights. Weights excluded from the model are implicitly set to zero.
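To see how quickly the number of weights grows, the count for a single hidden layer network can be estimated as below. This sketch assumes one bias weight per hidden and output node and no direct input-to-output connections; the function name is hypothetical.

```python
def count_weights(n_inputs, n_hidden, n_outputs):
    """Weights in a fully connected network with one hidden layer."""
    input_to_hidden = (n_inputs + 1) * n_hidden    # +1 for the bias
    hidden_to_output = (n_hidden + 1) * n_outputs  # +1 for the bias
    return input_to_hidden + hidden_to_output

count_weights(20, 10, 1)    # 221
count_weights(200, 50, 3)   # 10203
```

Categorical predictors are typically expanded into several numeric inputs, so the effective number of inputs can be larger than the number of predictor fields selected.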
The maximum number of iterations for model estimation: This value controls the number of attempts the algorithm can make to improve the set of model weights relative to the previous set of weights. If no further improvements are found before the maximum number of iterations is reached, the algorithm terminates and returns the best set of weights. This option defaults to 100 iterations. Given the behavior of the algorithm, it often makes sense to increase this value if needed, at the cost of a longer runtime for model creation.
Plot size: Select inches or centimeters for the size of the plot.
Plot resolution: Select the resolution of the plot in dots per inch: 1x (96 dpi); 2x (192 dpi); or 3x (288 dpi).
Lower resolution creates a smaller file and is best for viewing on a monitor.
Higher resolution creates a larger file with better print quality.
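The effect of the size and resolution settings on file size can be estimated from the pixel dimensions of the plot. The sketch below assumes that pixels equal inches multiplied by dots per inch; the function name and the example sizes are hypothetical.

```python
# Dots per inch for the 1x, 2x, and 3x resolution options.
DPI_BY_SCALE = {1: 96, 2: 192, 3: 288}

def plot_pixels(width, height, units="inches", scale=1):
    """Return the (width, height) of a plot in pixels."""
    if units == "centimeters":
        width, height = width / 2.54, height / 2.54  # convert to inches
    elif units != "inches":
        raise ValueError("units must be 'inches' or 'centimeters'")
    dpi = DPI_BY_SCALE[scale]
    return round(width * dpi), round(height * dpi)

plot_pixels(5, 5, "inches", scale=1)   # (480, 480)
plot_pixels(5, 5, "inches", scale=3)   # (1440, 1440)
```

Going from 1x to 3x multiplies the pixel count by nine, which is why higher resolutions produce noticeably larger files.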
Base font size (points): Select the size of the font in the plot.
View the Output
O anchor: Object. Consists of a table of the serialized model with its model name.
R anchor: Report. Consists of the report snippets generated by the Neural Network tool, such as a basic model summary, as well as main effect plots for each class of the target variable.