...
This section provides a short overview of the platform.
What is the platform?
...
Data preparation (or data wrangling) has been a constant challenge for decades, and that challenge has only amplified as data volumes have exploded.
Why use the platform?
Tip: Company value: Be a multiplier.
Estimates vary, but roughly 60% of an analyst's time goes to preparing data for use, leaving about two days of a five-day week to actually analyze it. That's expensive and inefficient.
Note: Did you know: This new category of software, called data wrangling or data prep, was invented by the founders of the company behind the platform.
...
The platform features a leading-edge interface, powerful machine intelligence, and advanced distributed processing.
Predictive Transformation
Tip: Company value: It starts with the user.
...
For more information, see Overview of Predictive Transformation.
Machine Learning
Tip: Company value: Always be learning.
As you make selections, the platform's predictions become smarter and better. What you select today with this dataset informs the platform's recommendations for transforming tomorrow's dataset.
...
The scale and complexity of these transformations can quickly overwhelm even the most powerful of machines.
Figure: Platform interactions and data movements
...
The platform supports integration with:
- Cloudera and Hortonworks variants of Hadoop
- Azure Databricks
- AWS Databricks
- Amazon EMR
...
See Supported Deployment Scenarios for Cloudera.
See Supported Deployment Scenarios for Hortonworks.
...
Datasets, recipes, and outputs can be grouped together into objects called flows. A flow is a unit of organization in the platform. Depending on your product, flows can be shared between users, scheduled for automated execution, exported from the platform, and imported into it. In this manner, you can build and test your recipes, chain together sets of datasets and recipes in a flow, share your work with others, and operationalize your production datasets for automated execution.
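The relationship between datasets, recipes, and flows can be illustrated with a small sketch. This is not the platform's actual object model or API; the class names (`Dataset`, `Recipe`, `Flow`), their methods, and the sample steps are all hypothetical, chosen only to show how recipes can be chained so that the output of one becomes the input of the next.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
Step = Callable[[List[Row]], List[Row]]


@dataclass
class Dataset:
    """A named collection of rows (hypothetical stand-in for a dataset)."""
    name: str
    rows: List[Row]


@dataclass
class Recipe:
    """An ordered list of transformation steps (hypothetical)."""
    name: str
    steps: List[Step] = field(default_factory=list)

    def run(self, rows: List[Row]) -> List[Row]:
        # Apply each step in order; each step returns a new list of rows.
        for step in self.steps:
            rows = step(rows)
        return rows


@dataclass
class Flow:
    """Groups datasets and the recipes that connect them (hypothetical)."""
    name: str
    datasets: Dict[str, Dataset] = field(default_factory=dict)
    # Each link is (recipe, input dataset name, output dataset name).
    links: List[Tuple[Recipe, str, str]] = field(default_factory=list)

    def add_dataset(self, dataset: Dataset) -> None:
        self.datasets[dataset.name] = dataset

    def add_recipe(self, recipe: Recipe, source: str, output: str) -> None:
        self.links.append((recipe, source, output))

    def run(self) -> Dict[str, Dataset]:
        # Links run in the order they were added, so a later recipe
        # can consume the output of an earlier one.
        for recipe, source, output in self.links:
            result = recipe.run(self.datasets[source].rows)
            self.datasets[output] = Dataset(output, result)
        return self.datasets


def trim_values(rows: List[Row]) -> List[Row]:
    return [{k: v.strip() for k, v in row.items()} for row in rows]


def drop_empty_city(rows: List[Row]) -> List[Row]:
    return [row for row in rows if row["city"]]


def uppercase_city(rows: List[Row]) -> List[Row]:
    return [{**row, "city": row["city"].upper()} for row in rows]


flow = Flow("customer_cleanup")
flow.add_dataset(Dataset("raw", [{"city": " Oslo "}, {"city": "  "}]))
flow.add_recipe(Recipe("clean", [trim_values, drop_empty_city]), "raw", "cleaned")
flow.add_recipe(Recipe("normalize", [uppercase_city]), "cleaned", "final")
result = flow.run()
```

Here the `clean` recipe feeds the `normalize` recipe through the intermediate `cleaned` dataset, which mirrors the idea of chaining datasets and recipes inside a single flow.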
What else can you do in the platform?
In addition to the above, the following key features simplify the data prep process and bring enterprise-grade tools for managing your production wrangling efforts.
Visual Profiling
Tip: Company value: Iterate to excellence.
For individual columns in your dataset, data histograms and data quality information immediately identify potential issues with the column. Select from these color-coded bars, and specific suggestions for transformations are surfaced for you. When you make a selection, you can optionally choose to display only the rows or columns affected by the change.
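As a rough illustration of what a per-column profile involves, the sketch below computes a value histogram and data quality counts for one column. It is not the platform's implementation: the function names, the three quality categories (`valid`, `mismatched`, `missing`), and the integer validity check are assumptions made for the example.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional


def is_integer(text: str) -> bool:
    """Hypothetical validity check: does the value parse as an integer?"""
    try:
        int(text)
        return True
    except ValueError:
        return False


def classify(value: Optional[str], is_valid: Callable[[str], bool]) -> str:
    """Assign a value to one of three assumed quality categories."""
    if value is None or value.strip() == "":
        return "missing"
    return "valid" if is_valid(value) else "mismatched"


def profile_column(
    values: List[Optional[str]], is_valid: Callable[[str], bool]
) -> Dict[str, Counter]:
    """Return quality counts and a histogram of the valid values."""
    quality = Counter(valid=0, mismatched=0, missing=0)
    histogram: Counter = Counter()
    for value in values:
        category = classify(value, is_valid)
        quality[category] += 1
        if category == "valid":
            histogram[value.strip()] += 1
    return {"quality": quality, "histogram": histogram}


def affected_rows(
    values: List[Optional[str]], is_valid: Callable[[str], bool], category: str
) -> List[int]:
    """Row indexes whose value falls in the selected quality category."""
    return [i for i, v in enumerate(values) if classify(v, is_valid) == category]


ages = ["30", "41", "n/a", "", None, "30"]
profile = profile_column(ages, is_integer)
mismatched = affected_rows(ages, is_integer, "mismatched")
```

Selecting a category and listing only the matching row indexes corresponds loosely to choosing to display only the rows affected by a change.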
...
The platform supports automation through externally available REST APIs.
See API Reference.
...
Getting Started
Overviews: Predictive Transformation | Sampling | Visual Profiling
Workflow Basics: Object Model | Import | Profiling | Transform | Running Job | Export