You can integrate data from other sources into your current dataset. Based on a key column that you identify in the lookup dataset, you can insert the corresponding values in other columns of the lookup dataset as new columns in your source dataset. |
Tip: Column lookups are useful for adding reference data based on a column's values. |
For example, your data contains the two-letter abbreviations for U.S. states, yet the target system is expecting the full name of each state. You need to replace the XY
state abbreviation with the full name of each state in each row.
Your data table should like the following:
State-2Letter | State |
---|---|
AL | Alabama |
AK | Alaska |
AZ | Arizona |
AR | Arkansas |
CA | California |
CO | Colorado |
CT | Connecticut |
DE | Delaware |
DC | District of Columbia |
FL | Florida |
GA | Georgia |
HI | Hawaii |
ID | Idaho |
IL | Illinois |
IN | Indiana |
IA | Iowa |
KS | Kansas |
KY | Kentucky |
LA | Louisiana |
ME | Maine |
MD | Maryland |
MA | Massachusetts |
MI | Michigan |
MN | Minnesota |
MS | Mississippi |
MO | Missouri |
MT | Montana |
NE | Nebraska |
NV | Nevada |
NH | New Hampshire |
NJ | New Jersey |
NM | New Mexico |
NY | New York |
NC | North Carolina |
ND | North Dakota |
OH | Ohio |
OK | Oklahoma |
OR | Oregon |
PA | Pennsylvania |
RI | Rhode Island |
SC | South Carolina |
SD | South Dakota |
TN | Tennessee |
TX | Texas |
UT | Utah |
VT | Vermont |
VA | Virginia |
WA | Washington |
WV | West Virginia |
WI | Wisconsin |
WY | Wyoming |
Tip: You can download a version of this table, which also includes some timezone information. See Dict-TimezoneByState.csv. |
This data table must be uploaded as a new dataset.
Steps:
State_2Letter
.See Lookup Wizard.
The CSV linked above also contains timezone information for each state, which you can use to provide higher fidelity information on timestamps.
U.S. timezones are not consistently demarcated by state lines. Some states are split across multiple timezones. For more accurate representation of timezones, you should download and use a zipcode database, many of which are freely available online. This CSV is provided for demonstration purposes only. |
In this case, you are working with a dataset that contains timestamps, which are stored in different timezones based on the location where an event or transaction occurred. However, the timestamps do not contain any timezone information.
You can use an external source of timezone information to insert timezones into your dataset. In the following example, timezones are derived based on two-letter abbreviations for U.S. state. A more accurate representation would be based on zipcode data.
Steps:
Time Offsets
column identifies the predominant timezone in each state as an offset of the UTC timezone (Greenwich Mean Time).NOTE: Depending on the requirements of your target system, you can use the Split transformation to break up column data so that only the numerical offset (e.g. |