Version 2: This section describes how you can import JSON files into |
The basic task is described by way of example. In the example task, the JSON file must be imported into , a new column must be inserted into the JSON, and the resulting JSON must be exported in the same structure.
Each JSON record must be less than 20 MB in size.
.json
or .JSON
. For best results, all keys and values should be quoted and imported as strings.
NOTE: Escape characters that make JSON invalid can cause your JSON file to fail to import. |
You can escape quote values to treat them as literals in your strings using the backslash character. For example: \"
JSON structure | Description | Supported? | |
---|---|---|---|
Newline | The newline character (
| Supported | |
Top-level object | Top-level row contains keys for mapping JSON objects | Supported | |
Top-level array | Top-level row contains array of objects | Supported |
NOTE: JSON-formatted files that are generated by |
\n
character. If your downstream system is expecting comma-delineated records except for the last one, additional work is required outside of the application.The following example contains records of images from a website:
{"metrics": [{"rank": "1043", "score" : "9679"}], "caption": "Such a good boy!", "id": "9kt8ex", "url": "https://www.example.com/w285fpp11.jpg", "filename": "w285fpp11.jpg"} {"metrics": [{"rank": "1042", "score" : "9681"}], "caption": "This sweet puppy has transformed our life!", "id": "9x2774", "url": "https://www.example.com/fmll0cy11.jpg", "filename": "fmll0cy11.jpg"} {"metrics": [{"rank": "1041", "score" : "9683"}], "caption": "We sure love our fur babies.", "id": "a8guou", "url": "https://www.example.com/mljnmq521.jpg", "filename": "mljnmq521.jpg"} |
Notes:
Each row is a complete JSON record containing keys and values.
Tip: Nested JSON, such as |
Each key's value must have a comma after it, except for the final key value in any row.
NOTE: The end of a JSON record is the right curly bracket (}). Commas are not added to the end of each line in this format. |
Import the JSON file.
Through the Import Data page, navigate and select your JSON file for import.
NOTE: File formats are detected based on the file extension. Please verify that your file extension is |
The file is passed through the conversion process, which reviews the JSON file and stores it on the base storage layer in a format that can be easily ingested as in row-per-record format. This process happens within the Import Data page. You can track progress on the right side of the screen.
After the file has been converted, click the Preview icon on the right side of the screen. In the Preview, you can review the first few rows of the imported file.
If some rows are missing from the preview, then you may have a syntax error in the first row after the last well-structured row. You should try to fix this in source and re-import.
If all of the rows are problematic, your data is likely malformed.
Complete the rest of the import process.
In the Transformer page, the example above should look like the following:
metrics | caption | id | url | filename |
---|---|---|---|---|
[{"rank":"1043","score":"9679"}] | Such a good boy! | 9kt8ex | https://www.example.com/w285fpp11.jpg | w285fpp11.jpg |
[{"rank":"1042","score":"9681"}] | This sweet puppy has transformed our life! | 9x2774 | https://www.example.com/fmll0cy11.jpg | fmll0cy11.jpg |
[{"rank":"1041","score":"9683"}] | We sure love our fur babies. | a8guou | https://www.example.com/mljnmq521.jpg | mljnmq521.jpg |
Your JSON records are in tabular format. If you have nested JSON objects within your JSON records, the next step is to unnest your JSON records.
NOTE: For JSON records that have multiple levels in the hierarchy, you should unnest the top level of the hierarchy first, followed by each successive level. |
Tip: The easiest way to unnest is to select the column header for the column containing your nested data. Unnest should be one of the suggested options, and the suggestion should include the specification for the paths to the key values. If not, you can use the following process. |
unnest values into new columns
. Specify the following transformation. Substitute the Paths to elements values below with the top-level keys in your JSON records:
Tip: You can choose to remove the original from the source or not. In deeper or wider JSON files, removing can help to identify what remains to be unnested. |
So, this step breaks out the key-value pairs for the specified keys into separate columns in the dataset.
You can now delete the source columns. In the example, these source columns are named metrics
and 0
.
Tip: SHIFT + click these columns and then select Delete columns from the right panel. Click Add. |
Repeat the above steps for each nested JSON object.
Tip: If the above set of steps needs to be applied to multiple files, you might consider stopping your work and returning to Flow View. Select this recipe and click Add New Recipe. If you add successive steps in another recipe, the first one can be used for doing initial processing of your JSON files, separate from any wrangling that you may do for individual files. |
Tip: The unnesting process may have moved some columns into positions that are different from their order in the original JSON. Use the Move command from the column menu to reposition your columns. |
Your JSON data is ready for wrangling. Continue adding steps until you have transformed your data as needed and are ready to run a job on it.
NOTE: If your desired JSON output does not include multiple hierarchies, you can skip this section. The generated JSON files are a single JSON record per row. |
If you ran a job on the example dataset, the output would look like the following:
{"rank":1043,"score":9679,"caption":"Such a good boy!","id":"9kt8ex","url":"https://www.example.com/w285fpp11.jpg","filename":"w285fpp11.jpg"} {"rank":1042,"score":9681,"caption":"This sweet puppy has transformed our life!","id":"9x2774","url":"https://www.example.com/fmll0cy11.jpg","filename":"fmll0cy11.jpg"} {"rank":1041,"score":9683,"caption":"We sure love our fur babies.","id":"a8guou","url":"https://www.example.com/mljnmq521.jpg","filename":"mljnmq521.jpg"} |
Suppose you want to nest the url
and filename
columns into a nested array called, resources
.
Re-nest the lower hierarchies until have you have a single flat record, containing some Object type columns that hold the underlying hierarchies. When the re-nested JSON records are exported, secondary hierarchies appear as escaped string values. More details later.
Tip: The following steps reshape your data. You may wish to create a new recipe as an output of the previous recipe where you can add the following steps. |
Steps:
url
and filename
columns. Then, select Nest columns in the right-hand panel. This transformation should look like the following:column1
now contains an Object mapping of the two columns. You can now nest this column again into an Array:column1
.You must re-nested from the bottom of the target hierarchy to the top.
NOTE: Do not nest the columns at the top level of the hierarchy. |
When you are ready, you can run the job. Create or modify a publishing action to generate a JSON file for output. See Run Job Page.
When the job completes, you can click the JSON link in the Output Destinations tab of the Job Details page to download your JSON file. See Job Details Page.
Output file for the above example should look like the following:
{"rank":1043,"score":9679,"caption":"Such a good boy!","id":"9kt8ex","url":"https://www.example.com/w285fpp11.jpg","filename":"w285fpp11.jpg","resources":[{"url":"https://www.example.com/w285fpp11.jpg","filename":"w285fpp11.jpg"}]} {"rank":1042,"score":9681,"caption":"This sweet puppy has transformed our life!","id":"9x2774","url":"https://www.example.com/fmll0cy11.jpg","filename":"fmll0cy11.jpg","resources":[{"url":"https://www.example.com/fmll0cy11.jpg","filename":"fmll0cy11.jpg"}]} {"rank":1041,"score":9683,"caption":"We sure love our fur babies.","id":"a8guou","url":"https://www.example.com/mljnmq521.jpg","filename":"mljnmq521.jpg","resources":[{"url":"https://www.example.com/mljnmq521.jpg","filename":"mljnmq521.jpg"}]} |
Your JSON may be formatted as a single top-level object containing an array of JSON records. The following example contains records of messages about individual diet and exercise achievements:
{ "object": [ { "score": 19669, "title": "M/07/1'3\" [23lbs > 13lbs = 10lbs] Still a bit to go, but my owner no longer refers to me as his chunky boy!", "ups": 19669, "id": "9kt8ex", "url": "https://i.redd.it/bzygw285fpp11.jpg", "short": "bzygw285fpp11.jpg" }, { "score": 19171, "title": "M/29/5'11\" [605 pounds > 375 pounds = 230 pounds lost] (14 months) Still considered super morbidly obese but I've made some good progress.", "ups": 19171, "id": "9x2774", "url": "https://i.redd.it/wbbufmll0cy11.jpg", "short": "wbbufmll0cy11.jpg" }, { "score": 16778, "title": "F/28/5\u20197\u201d [233lbs to 130lbs] Got tired of being obese and took control of my life!", "ups": 16778, "id": "a8guou", "url": "https://i.redd.it/3t0kmljnmq521.jpg", "short": "3t0kmljnmq521.jpg" }, { "score": 16743, "title": "M/22/5'11\" [99lbs > 150lbs = 51lbs] Anorexia my recovery", "ups": 16743, "id": "atla3n", "url": "https://i.redd.it/9t6tvsjs16i21.jpg", "short": "9t6tvsjs16i21.jpg" } ] } |
The outer JSON is a single key-value pair:
object
When source JSON records structured in this manner are imported, each JSON record in the object is imported into a separate row. You can unnest this data by applying an Unnest values transformation.
NOTE: The object can contain only one nested array of JSON data. If the object contains multiple nested arrays, it is not not broken into separate rows. All unnesting must be performed in your recipe steps |
Suppose you want to compute the average of all workout scores. First, you must unnest the JSON records and then apply the AVERAGE
function.
Steps:
Tip: The easiest way to unnest is to select the column header for the column containing your data. After you select the column header, you are provided with suggestions to Unnest Values into new columns. You can use the Unnest suggestion and click Add. The following steps illustrate how to create this transformation manually. |
unnest values into new columns
. Specify the following transformation. Substitute the Paths to elements values below with the top-level keys in your JSON records:
The above step breaks out the key-value pairs for the specified keys into separate columns in the dataset. Each Paths to elements entry specifies a key in the JSON record, which is used to create a new column of the same name. The key's associated value becomes a cell value in the new column.
You can now delete the source column. In the example, the source column is object
.
Tip: You can choose to remove the original from the source or not. In deeper or wider JSON files, removing can help to identify what remains to be unnested. When you're done unnesting a column and have removed data from the original, you should have an empty column. |
Results:
id | score | short | title | ups | url |
---|---|---|---|---|---|
9kt8ex | 19669 | bzygw285fpp11.jpg | M/07/1'3" [23lbs > 13lbs = 10lbs] Still a bit to go, but my owner no longer refers to me as his chunky boy! | 19669 | https://i.redd.it/bzygw285fpp11.jpg |
9x2774 | 19171 | wbbufmll0cy11.jpg | M/29/5'11" [605 pounds > 375 pounds = 230 pounds lost] (14 months) Still considered super morbidly obese but I've made some good progress. | 19171 | https://i.redd.it/wbbufmll0cy11.jpg |
a8guou | 16778 | 3t0kmljnmq521.jpg | F/28/5’7” [233lbs to 130lbs] Got tired of being obese and took control of my life! | 16778 | https://i.redd.it/3t0kmljnmq521.jpg |
atla3n | 16743 | 9t6tvsjs16i21.jpg | M/22/5'11" [99lbs > 150lbs = 51lbs] Anorexia my recovery | 16743 | https://i.redd.it/9t6tvsjs16i21.jpg |
Now you can find the average score by applying average function.
Results:
id | score | short | title | ups | url | Average_score |
---|---|---|---|---|---|---|
9kt8ex | 19669 | bzygw285fpp11.jpg | M/07/1'3" [23lbs > 13lbs = 10lbs] Still a bit to go, but my owner no longer refers to me as his chunky boy! | 19669 | https://i.redd.it/bzygw285fpp11.jpg | 18090.25 |
9x2774 | 19171 | wbbufmll0cy11.jpg | M/29/5'11" [605 pounds > 375 pounds = 230 pounds lost] (14 months) Still considered super morbidly obese but I've made some good progress. | 19171 | https://i.redd.it/wbbufmll0cy11.jpg | 18090.25 |
a8guou | 16778 | 3t0kmljnmq521.jpg | F/28/5’7” [233lbs to 130lbs] Got tired of being obese and took control of my life! | 16778 | https://i.redd.it/3t0kmljnmq521.jpg | 18090.25 |
atla3n | 16743 | 9t6tvsjs16i21.jpg | M/22/5'11" [99lbs > 150lbs = 51lbs] Anorexia my recovery | 16743 | https://i.redd.it/9t6tvsjs16i21.jpg | 18090.25 |