You can create connections to MongoDB and MongoDB Atlas through the product. These connections enable you to read data from MongoDB.


If you are connecting to any relational source of data, you must add the product to your whitelist for those resources. For more information, see Getting Started with Cloud Dataprep.

Prerequisites

Limitations

Create Connection

MongoDB

To create a MongoDB connection, please specify the following properties:

Each property name below is followed by its description.
Host

Name of the host.

Port

Set this value to the port number through which to access MongoDB. By default, this value is 27017.

Database

Name of the database that you want to read.

Auth Database

Name of the MongoDB database used for authentication.

Replica Set

(Optional) Comma-separated list of secondary servers in the replica set, specified by address and port.

A replica set is a group of MongoDB processes that maintain the same data set. Replica sets provide redundancy and high availability and are the basis for all production deployments. For more information, see https://docs.mongodb.com/manual/replication/.

Secondary Reads

Enable this checkbox if you want to read from secondary (slave) servers.

Use SSL

Enable this checkbox if you want to connect using SSL.

Connect String Options

(Optional) You can specify additional options used to connect as a string value.

The following option sets the connection timeout in seconds:

Timeout=0;

The default value is 0, which disables connection timeouts. See below for more information.

Test Connection

After you have defined the connection credentials type, credentials, and connection string, you can verify that the product can use them to connect to the database.

Default Column Data Type Inference

Set to disabled to prevent the product from applying its own type inference to each column on import. The default value is enabled.

Connection Name

Display name of the connection.

Connection Description

(Optional) Description of the connection, which appears in the application.
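
The Replica Set property above takes a comma-separated list of servers, each given by address and port. The following Python sketch shows one way to assemble such a value. It assumes the conventional address:port form for each entry; the function name and the host names are hypothetical and appear only for illustration.

```python
from typing import Iterable, Tuple

DEFAULT_MONGODB_PORT = 27017  # default port stated for the Port property


def format_replica_set(servers: Iterable[Tuple[str, int]]) -> str:
    """Join (address, port) pairs into a comma-separated address:port list.

    The address:port form is an assumption based on common usage.
    """
    parts = []
    for address, port in servers:
        if not address:
            raise ValueError("each server needs a non-empty address")
        if not 0 < port < 65536:
            raise ValueError(f"invalid port for {address}: {port}")
        parts.append(f"{address}:{port}")
    return ",".join(parts)


# Hypothetical secondary servers; replace with the servers in your replica set.
replica_set_value = format_replica_set([
    ("mongo-secondary-1.example.com", DEFAULT_MONGODB_PORT),
    ("mongo-secondary-2.example.com", 27018),
])
print(replica_set_value)
```

The resulting string can be pasted into the Replica Set field. Validating the port range up front catches typos before you test the connection.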

MongoDB Atlas

To create a MongoDB Atlas connection, please specify the following properties:

Each property name below is followed by its description.
Host

Name of the host.

Port

Set this value to the port number through which to access MongoDB. By default, this value is 27017.

Database

Name of the database that you want to read.

Replica Set

(Optional) Comma-separated list of secondary servers in the replica set, specified by address and port.

A replica set is a group of MongoDB processes that maintain the same data set. Replica sets provide redundancy and high availability and are the basis for all production deployments. For more information, see https://docs.mongodb.com/manual/replication/.

Secondary Reads

Enable this checkbox if you want to read from secondary (slave) servers.

Connect String Options

(Optional) You can specify additional options used to connect as a string value. The following option sets the connection timeout in seconds:

Timeout=0;

The default value is 0, which disables connection timeouts. See below for more information.

Test Connection

After you have defined the connection credentials type, credentials, and connection string, you can verify that the product can use them to connect to the database.

Default Column Data Type Inference

Set to disabled to prevent the product from applying its own type inference to each column on import. The default value is enabled.

Connection Name

Display name of the connection.

Connection Description

(Optional) Description of the connection, which appears in the application.

For more information on these settings, see http://cdn.cdata.com/help/RCF/jdbc/default.htm.

Create connection via API

Depending on your product edition, you can create connections of this type through the API. Use the following key values in your request for each connection type.

MongoDB:

"vendor": "mongodb",
"vendorName": "MongoDB",
"type": "jdbc"

MongoDB Atlas:

"vendor": "mongodb_atlas",
"vendorName": "MongoDB Atlas",
"type": "jdbc"


For more information, see operation/createConnection.
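
As a rough illustration of how the key values above might be placed in a request body, here is a minimal Python sketch. Only the vendor, vendorName, and type values come from this section. The helper name and the extra fields (name, host, port) are hypothetical, so check them against operation/createConnection before use. The sketch only builds and prints JSON and sends no request.

```python
import json

# Key values taken from this section for each connection type.
VENDOR_KEYS = {
    "MongoDB": {"vendor": "mongodb", "vendorName": "MongoDB", "type": "jdbc"},
    "MongoDB Atlas": {
        "vendor": "mongodb_atlas",
        "vendorName": "MongoDB Atlas",
        "type": "jdbc",
    },
}


def build_connection_body(connection_type: str, **extra_fields) -> dict:
    """Return a request body seeded with the documented vendor keys.

    extra_fields are NOT documented in this section; any names passed in
    are assumptions to verify against operation/createConnection.
    """
    if connection_type not in VENDOR_KEYS:
        raise ValueError(f"unknown connection type: {connection_type}")
    body = dict(VENDOR_KEYS[connection_type])  # copy so the table is not mutated
    body.update(extra_fields)
    return body


# "name", "host", and "port" are hypothetical field names for illustration.
body = build_connection_body(
    "MongoDB Atlas", name="my_atlas", host="cluster0.example.com", port=27017
)
print(json.dumps(body, indent=2))
```

Keeping the documented key values in one lookup table means the two connection types cannot be mixed up when the body is assembled.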

Connect string options

Connection timeout

By default, the supported driver applies a connection timeout of 0 seconds to MongoDB connections, which disables connection timeouts. As needed, you can modify the connection timeout through connect string options:

Timeout=<value_in_seconds>;

where:

<value_in_seconds> is the number of seconds for the timeout.
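
The Key=value; form above can be generated programmatically. The following Python sketch is one way to do that. The helper name is hypothetical, and chaining several options back to back, each terminated by a semicolon, is an assumption based on the single-option form shown above.

```python
def build_connect_string_options(options: dict) -> str:
    """Render options as 'Key=value;' pairs, matching the Timeout=0; form."""
    parts = []
    for key, value in options.items():
        if isinstance(value, bool):
            # Render booleans in lowercase, for example false.
            value = "true" if value else "false"
        parts.append(f"{key}={value};")
    return "".join(parts)


def timeout_option(seconds: int) -> str:
    """Build the Timeout option; 0 disables connection timeouts."""
    if seconds < 0:
        raise ValueError("timeout must be 0 or a positive number of seconds")
    return build_connect_string_options({"Timeout": seconds})


print(timeout_option(30))  # prints Timeout=30;
```

The printed value can be pasted into the Connect String Options field.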

Flattening Documents

Documents can contain other documents, which enables the storage of nested data. You can control how the CData driver flattens nested objects and arrays through Connect String Options.

NOTE: Columns that have been flattened can be accessed or referenced using custom SQL queries. Additional information is below.


Flatten Objects:

By default, the CData driver flattens nested objects. As needed, you can set FlattenObjects to false to disable this behavior.

For more information, see http://cdn.cdata.com/help/DGF/jdbc/RSBMongodb_p_FlattenObjects.htm.

Flatten Arrays:

By default, the CData driver does not flatten arrays.

For more information, see http://cdn.cdata.com/help/DGF/jdbc/RSBMongodb_p_FlattenArrays.htm.

Referencing flattened columns:

If you have flattened Objects or Arrays, you can reference these columns using square brackets in your custom SQL queries.

Example of flattened Object:

SELECT [address.city] FROM my_table;

Example of flattened Array:

SELECT * FROM my_table WHERE [hobbies.0]='cricket';
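
To show where column names such as address.city and hobbies.0 come from, the following Python sketch flattens a nested document into dotted column names. It only illustrates the naming pattern used in the queries above and is not the CData driver's implementation. The function name and the sample document are hypothetical.

```python
from typing import Any, Dict


def flatten_document(doc: Dict[str, Any], flatten_arrays: bool = True) -> Dict[str, Any]:
    """Flatten nested dicts (and optionally lists) into dotted column names."""
    flat: Dict[str, Any] = {}

    def walk(value: Any, prefix: str) -> None:
        if isinstance(value, dict):
            # Nested object: extend the column name with each key.
            for key, child in value.items():
                walk(child, f"{prefix}.{key}" if prefix else key)
        elif isinstance(value, list) and flatten_arrays:
            # Array: extend the column name with each element's index.
            for index, child in enumerate(value):
                walk(child, f"{prefix}.{index}" if prefix else str(index))
        else:
            flat[prefix] = value

    walk(doc, "")
    return flat


# Hypothetical document with one nested object and one array.
sample = {"name": "Ana", "address": {"city": "Pune"}, "hobbies": ["cricket", "chess"]}
for column, value in flatten_document(sample).items():
    print(f"[{column}] = {value!r}")
```

With flatten_arrays=False the hobbies value stays a single column holding the whole list, which mirrors the idea of arrays not being flattened by default.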


Driver Information

For more information on CData JDBC drivers, see http://cdn.cdata.com/help/DGF/jdbc/default.htm.

Using MongoDB

MongoDB is a NoSQL document database that provides high performance, availability, and scalability. 

MongoDB Data Organization Hierarchy

MongoDB has a two-level data hierarchy:

+ Schema1
  + Collection1
  + Collection2
+ Schema2
  + Collection3
  + Collection4

Database Uses

For more information on interacting with databases, see Using Databases.

Read Data

You can import datasets from MongoDB through the Import Data page. See Import Data Page.

Data Type Mappings

NOTE: The data types listed in this section reflect the raw data type of the converted column. Depending on the contents of the column, the Transformer Page may re-infer a different data type when a dataset using this type of source is loaded.

Access/Read

When data is imported from MongoDB, the supported data types from the source are converted to corresponding data types supported by the product. For more information, see Type Conversions.

Source Data Type | Supported | Converted Data Type
ObjectId | Y | String
RegEx | Y | String
String | Y | String
Binary | Y | String
Integer | Y | Integer
Timestamp | Y | Datetime
Double | Y | Float
Array | Y | String
Bool | Y | Bool
Null | Y | String
Date | Y | Datetime

Write/Publish

Not supported.