Data Segregation

Auto Insights provides a multi-tenanted, cloud-based, SaaS solution, where multiple clients share the same underlying infrastructure. Ensuring that data remains separated across clients is of utmost importance.

This document describes the different layers of security and segregation we employ.

Application-Level Segregation

User Access Model

Application-level segregation is achieved through Auto Insights' own user management system. Its hierarchical user access model manages Workspaces, Groups, and Users, and provides logical segregation of datasets.

A Workspace may represent an individual company or a business unit within a larger company. All resources in our system, including Groups, Users, and Datasets, must belong to a single workspace and cannot be moved between workspaces once defined.

Groups link users to datasets within the same Workspace. Each group can have multiple users and can hold permissions to access multiple datasets, with fine-grained access control, all within that workspace. In addition, each group can be configured with its own permission level.

Users can have different permission levels on different datasets by belonging to different groups. No user can share datasets between workspaces.

Multiple controls are in place throughout the software development life cycle to ensure the user access model is adhered to. These include (but are not limited to) code reviews, high-level architectural reviews, and targeted security reviews. Additionally, all Auto Insights software engineers receive training on secure coding practices.
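The Workspace, Group, and User relationships described in this section can be illustrated with a minimal sketch. It is not Auto Insights' actual implementation: the class names, the Permission levels, and the rule that the highest permission granted by any group wins are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Permission(IntEnum):
    """Hypothetical permission levels; the real levels are not documented here."""
    NONE = 0
    VIEW = 1
    EDIT = 2


@dataclass(frozen=True)
class Dataset:
    dataset_id: str
    workspace_id: str  # set once at creation; frozen=True blocks reassignment


@dataclass(frozen=True)
class User:
    user_id: str
    workspace_id: str


@dataclass
class Group:
    group_id: str
    workspace_id: str
    permission: Permission
    user_ids: set = field(default_factory=set)
    dataset_ids: set = field(default_factory=set)

    def add_user(self, user: User) -> None:
        # A group may only contain users from its own workspace.
        if user.workspace_id != self.workspace_id:
            raise ValueError("user belongs to a different workspace")
        self.user_ids.add(user.user_id)

    def add_dataset(self, dataset: Dataset) -> None:
        # A group may only grant access to datasets in its own workspace.
        if dataset.workspace_id != self.workspace_id:
            raise ValueError("dataset belongs to a different workspace")
        self.dataset_ids.add(dataset.dataset_id)


def effective_permission(user: User, dataset: Dataset, groups: list) -> Permission:
    """Return the highest level granted by any group linking user and dataset.

    Taking the maximum is an assumption; the source does not say how
    overlapping group permissions are combined.
    """
    if user.workspace_id != dataset.workspace_id:
        return Permission.NONE
    levels = [
        g.permission
        for g in groups
        if g.workspace_id == dataset.workspace_id
        and user.user_id in g.user_ids
        and dataset.dataset_id in g.dataset_ids
    ]
    return max(levels, default=Permission.NONE)
```

In this sketch a dataset from another workspace can never be attached to a group, so there is no path by which a user could reach it, which mirrors the rule that datasets cannot be shared between workspaces.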

User Authentication

Users authenticate to the application with a JSON Web Token (JWT). Because each request carries the identity of an authenticated user, access to datasets is also segregated at the user authorization level.

Data-Level Segregation

Customer data in Auto Insights falls into three categories: source data, analytics data, and metadata. Source data is the raw data that users upload or import into Auto Insights, stored as a staging file. Analytics data is a copy of the source data that has been processed, analyzed, and loaded into our Analytics database to support real-time queries. Metadata is information about the data that is derived by Auto Insights' algorithms, and may include the results of calculations and analysis.

Source data is stored as a staging file on encrypted disks, segregated on disk by directory structure. Access to each directory is controlled based on user permissions to the dataset.
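Directory-based segregation of staging files might look like the following sketch. The workspace/dataset directory layout, the STAGING_ROOT path, and the function names are hypothetical; the source only states that files are segregated by directory structure and that access follows dataset permissions.

```python
import re
from pathlib import Path

STAGING_ROOT = Path("/data/staging")  # hypothetical location
_SAFE_ID = re.compile(r"[A-Za-z0-9_-]+")


def staging_dir(workspace_id: str, dataset_id: str, root: Path = STAGING_ROOT) -> Path:
    """Return the directory reserved for one dataset in one workspace."""
    for part in (workspace_id, dataset_id):
        # Reject separators and dots so an identifier cannot escape its directory.
        if not _SAFE_ID.fullmatch(part):
            raise ValueError(f"invalid identifier: {part!r}")
    return root / workspace_id / dataset_id


def resolve_staging_path(
    permitted_dataset_ids: set,
    workspace_id: str,
    dataset_id: str,
    filename: str,
    root: Path = STAGING_ROOT,
) -> Path:
    """Return the file path only if the caller may access the dataset."""
    if dataset_id not in permitted_dataset_ids:
        raise PermissionError(f"no access to dataset {dataset_id!r}")
    base = staging_dir(workspace_id, dataset_id, root).resolve()
    target = (base / filename).resolve()
    # After normalization the file must still sit inside the dataset directory.
    if base not in target.parents:
        raise ValueError("path escapes the dataset directory")
    return target
```

The permission check happens before any path is built, and the final containment check guards against file names such as "../payroll/x.csv" that would otherwise reach a neighbouring dataset's directory.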

Analytics data is stored in our Analytics (OLAP) database, on encrypted disks. Access to analytics data is controlled by the user's permissions on the dataset, which is linked to a workspace.

Metadata is stored in a cloud-managed MySQL instance on encrypted disks. The table schema follows the logical model described in the preceding section: each metadata record belongs to a specific dataset, which is linked to a workspace.
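The metadata-to-dataset-to-workspace chain can be sketched as a small relational schema. The example below uses Python's built-in sqlite3 module as a stand-in for the MySQL instance, and every table and column name is an assumption; the actual schema is not published here.

```python
import sqlite3

# Hypothetical schema: metadata rows point at a dataset, datasets at a workspace.
SCHEMA = """
CREATE TABLE workspace (
    id TEXT PRIMARY KEY
);
CREATE TABLE dataset (
    id TEXT PRIMARY KEY,
    workspace_id TEXT NOT NULL REFERENCES workspace(id)
);
CREATE TABLE metadata (
    id INTEGER PRIMARY KEY,
    dataset_id TEXT NOT NULL REFERENCES dataset(id),
    name TEXT NOT NULL,
    value TEXT NOT NULL
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two workspaces and one dataset each."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO workspace (id) VALUES (?)", [("ws-acme",), ("ws-other",)])
    conn.executemany(
        "INSERT INTO dataset (id, workspace_id) VALUES (?, ?)",
        [("sales", "ws-acme"), ("payroll", "ws-other")],
    )
    conn.executemany(
        "INSERT INTO metadata (dataset_id, name, value) VALUES (?, ?, ?)",
        [("sales", "row_count", "1200"), ("payroll", "row_count", "85")],
    )
    conn.commit()
    return conn


def metadata_for(conn: sqlite3.Connection, workspace_id: str, dataset_id: str) -> list:
    """Fetch metadata for a dataset, but only within the caller's workspace."""
    rows = conn.execute(
        "SELECT m.name, m.value "
        "FROM metadata AS m JOIN dataset AS d ON d.id = m.dataset_id "
        "WHERE d.id = ? AND d.workspace_id = ? "
        "ORDER BY m.name",
        (dataset_id, workspace_id),
    )
    return rows.fetchall()
```

Because the query always filters on the workspace as well as the dataset, asking for another workspace's dataset returns no rows rather than leaking its metadata.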

Network-Level Partitioning

The system's network-level partitioning is designed with a focus on security. The development and production networks are kept separate and private, with no direct connections between them.

Inbound connections to the network are managed through a load balancer. Its firewall allows connections only on specific, predetermined ports, which reduces potential security risks.

Outbound connections go through a Network Address Translation (NAT) gateway. This not only helps manage the network's outbound traffic but also adds a layer of security by hiding internal IP addresses from the external network.
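Some of the properties described in this section can be checked mechanically. The sketch below validates a hypothetical network plan: the address ranges must be private, must not overlap, and must expose only allowlisted inbound ports. The CIDR ranges, the single allowed port, and all names are assumptions, and non-overlapping ranges are only one aspect of keeping networks separate.

```python
import ipaddress
from dataclasses import dataclass

ALLOWED_INBOUND_PORTS = frozenset({443})  # hypothetical allowlist


@dataclass(frozen=True)
class NetworkPlan:
    name: str
    cidr: str
    inbound_ports: frozenset


def check_plans(plans: list) -> list:
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    networks = {plan.name: ipaddress.ip_network(plan.cidr) for plan in plans}

    for plan in plans:
        if not networks[plan.name].is_private:
            problems.append(f"{plan.name}: {plan.cidr} is not a private range")
        extra = plan.inbound_ports - ALLOWED_INBOUND_PORTS
        if extra:
            problems.append(f"{plan.name}: unexpected inbound ports {sorted(extra)}")

    # Compare every pair of networks once to detect shared address space.
    names = list(networks)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if networks[first].overlaps(networks[second]):
                problems.append(f"{first} and {second} overlap")
    return problems
```

A check like this could run in a deployment pipeline so that a plan which opens an extra port or reuses another environment's address space is flagged before it is applied.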