Comment: Published by Scroll Versions from space DEV and version next

Learn the basics of how to import, wrangle, execute jobs, profile, and export your data from the product.

Overview

The product enables analysts, data specialists, and other domain experts to quickly cleanse and transform datasets of varying sizes for use in other analytics systems throughout the enterprise. Using a set of web-based tools, you can import complex datasets and wrangle them for use in virtually any target system. Key capabilities include:

  • Import from flat files, databases, or distributed storage systems
  • Locate and remove or modify missing or mismatched data
  • Unnest complex data structures
  • Identify statistical outliers in your data for review and management
  • Perform lookups from one dataset into another reference dataset
  • Aggregate columnar data using a variety of aggregation functions
  • Normalize column values for more consistent usage and statistical modeling
  • Merge datasets with joins
  • Append one dataset to another through union operations

Most of these operations can be executed with a few mouse clicks. This section provides a basic overview of common workflows through the product.
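To make a few of the capabilities listed above concrete, the following is a minimal conceptual sketch in plain Python. It is not the product's interface or API: the sample rows, the `regions` reference table, and the 1.5 standard deviation outlier threshold are all assumptions chosen for illustration. In the product, the equivalent operations are performed through the web-based tools.

```python
from statistics import mean, pstdev

# Hypothetical sample rows; None marks a missing value.
orders = [
    {"order_id": 1, "region_id": "E", "amount": 20.0},
    {"order_id": 2, "region_id": "W", "amount": None},
    {"order_id": 3, "region_id": "E", "amount": 22.0},
    {"order_id": 4, "region_id": "W", "amount": 21.0},
    {"order_id": 5, "region_id": "E", "amount": 19.0},
    {"order_id": 6, "region_id": "W", "amount": 500.0},
]
# Hypothetical reference dataset used for the lookup.
regions = {"E": "East", "W": "West"}

# Locate and remove rows with missing data.
complete = [row for row in orders if row["amount"] is not None]

# Identify statistical outliers for review: values more than
# 1.5 population standard deviations from the mean (assumed threshold).
amounts = [row["amount"] for row in complete]
center, spread = mean(amounts), pstdev(amounts)
outliers = [row for row in complete if abs(row["amount"] - center) > 1.5 * spread]
outlier_ids = {row["order_id"] for row in outliers}
kept = [row for row in complete if row["order_id"] not in outlier_ids]

# Perform a lookup from one dataset into a reference dataset.
enriched = [{**row, "region": regions[row["region_id"]]} for row in kept]

# Aggregate: total amount per region.
totals = {}
for row in enriched:
    totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]

print(totals)
```

Each step produces a new list rather than modifying the input, which mirrors the idea of a recipe as an ordered series of transformations applied to a dataset.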

...

Prerequisites

Before you begin, please verify the following:

  • Account: You have an account for the product and can log in.

...

  • Example data: You should use a sample set of data during this workflow.

Basic Workflow

...

D s product

...

  1. Import data: Integrate data from a variety of sources.

    Tip: When you log in for the first time, you can immediately upload a dataset to begin transforming it.

    See Import Basics.

  2. Profile your data: Before, during, and after you transform your data, you can use the visual profiling tools to quickly analyze and make decisions about your data. See Profiling Basics.
  3. Build transform recipes: Use the various views in the Transformer Page to build your transform recipes and preview the results on sampled data. See Transform Basics.
  4. Sample your data: In the product, you create your recipes while working with a sample of your overall dataset. As needed, you can take new samples, which can provide new perspectives and enhance performance in complex flows. See Sampling Basics.

  5. Run job: Launch a job to run your recipe on the full dataset. Review results and iterate as needed. See Running Job Basics.

  6. Export results: Export the generated results data for use outside of the product. See Export Basics.
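The workflow steps above can be sketched conceptually in plain Python. This is an illustration of the import, profile, recipe, sample, run, and export sequence only, not the product's API: the `RAW` data, the function names, and the two recipe steps are hypothetical.

```python
import csv
import io

# Hypothetical raw input standing in for an uploaded file.
RAW = "name,city\nAda, london\nGrace,\nLinus,helsinki \n"

def import_data(text):
    """Import: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def profile(rows):
    """Profile: count empty values per column."""
    columns = list(rows[0].keys()) if rows else []
    return {c: sum(1 for r in rows if not r[c].strip()) for c in columns}

# Recipe: an ordered list of transformation steps.
def trim_city(row):
    return {**row, "city": row["city"].strip()}

def title_city(row):
    return {**row, "city": row["city"].title()}

recipe = [trim_city, title_city]

def run(steps, rows):
    """Run: apply every recipe step, in order, to every row."""
    out = []
    for row in rows:
        for step in steps:
            row = step(row)
        out.append(row)
    return out

def export(rows):
    """Export: serialize the rows back to CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

rows = import_data(RAW)
missing = profile(rows)             # profile the imported data
sample = rows[:2]                   # build the recipe against a sample
preview = run(recipe, sample)       # preview results on the sample
result = export(run(recipe, rows))  # run on the full dataset and export
print(result)
```

Previewing the recipe on a sample before running it on every row reflects the design described in step 4: recipes are developed against a subset of the data, and the job then applies them to the full dataset.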

Object overview: You should review the overview of the objects that are created and maintained in the product. See Object Overview.

Example Flows

New users of the product can learn by example. When a new workspace is created, the first user can access example flows through the product. These flows show how to build a flow and offer tips that you can apply to your own use cases.

Steps:

  1. Log in to the web application.
  2. In the left navigation bar, click the Flows icon.
  3. Select any of the example flows.
  4. Each flow contains detailed notes on the various objects in the flow, as well as recommended practices for building your own flows.