Data Security

Keeping your data secure is our highest priority.

Why does Django Lineage need to connect to my data store?

In short, we don’t have to if you don’t want us to. By connecting directly to your data store, we can infer the shape and structure of your schemas and fields. We can also use this information to periodically check whether the shape of your data has changed over time.

What happens when I enter my password?

Only you and your browser will ever know your password. Once you enter your password, it is sent over HTTPS to our servers, where the first thing we do is encrypt it using AWS KMS. We never save your password in plaintext. Any time we connect to your data store, we use KMS to decrypt the password on the fly.
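
The encrypt-on-receipt, decrypt-on-connect flow described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Django Lineage's actual implementation: the function names and the injected `kms_client` argument are hypothetical. The client is assumed to behave like a boto3 KMS client, whose `encrypt` and `decrypt` calls accept and return the fields used here.

```python
import base64


def encrypt_password(kms_client, key_id: str, password: str) -> str:
    """Encrypt a password with KMS and return text that is safe to store."""
    response = kms_client.encrypt(KeyId=key_id, Plaintext=password.encode("utf-8"))
    # Only the ciphertext is persisted; the plaintext is never written to storage.
    return base64.b64encode(response["CiphertextBlob"]).decode("ascii")


def decrypt_password(kms_client, stored: str) -> str:
    """Decrypt a stored password just before opening a data store connection."""
    response = kms_client.decrypt(CiphertextBlob=base64.b64decode(stored))
    return response["Plaintext"].decode("utf-8")
```

With real AWS credentials, `kms_client` would typically be `boto3.client("kms")` and `key_id` a KMS key ID, ARN, or alias. Passing the client in as an argument keeps the functions easy to exercise with a stand-in during testing.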

Is my data secure?

We host our entire stack in AWS and we follow AWS best practices for securing data including:

  • Deploying all databases and services in private subnets

  • Deploying all apps and services in serverless cap

  • Using HTTPS and SSL for all communication that we initiate (we strongly encourage you to use secure protocols when connecting to your data store)

  • Strictly limiting access to production systems for internal associates, with multi-factor authentication required for every login

Does Django Lineage store my data?

We never save your data in our system. The only exception is a file that you upload as part of a comment on a data store, schema or field. When you upload a sample file for us to infer a schema, we do not save your data unless you explicitly ask us to.

When we automatically crawl the data store to identify the schemas and fields of your data, we can sometimes get everything we need from the database metadata. Often, though, we need to process the actual data to infer the schema. This is especially true for JSON-style stores such as DynamoDB or MongoDB. To do this, we pull up to 200 records from each schema in your data store, infer the schema, and then discard the actual data.
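
As a rough illustration of the sample-and-discard approach described above, the Python sketch below reads at most 200 records, keeps only field names and type names, and returns nothing else. The function name, the `limit` parameter, and the flat type-name representation are assumptions made for this example; the real inference logic is not documented here and is likely more involved (nested documents, for instance).

```python
from collections import defaultdict
from itertools import islice

SAMPLE_LIMIT = 200  # the sample size stated above


def infer_schema(records, limit=SAMPLE_LIMIT):
    """Return {field name: sorted list of type names} from a sample of records."""
    field_types = defaultdict(set)
    for record in islice(records, limit):
        for field, value in record.items():
            field_types[field].add(type(value).__name__)
    # Only field names and type names are returned; the sampled values are not kept.
    return {field: sorted(types) for field, types in field_types.items()}


sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None, "tags": ["new"]},
]
print(infer_schema(sample))
# {'id': ['int'], 'email': ['NoneType', 'str'], 'tags': ['list']}
```

Because the function accepts any iterable, it works equally well with a database cursor or a generator, and `islice` stops reading once the limit is reached.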

One of the key features that we believe is important for any catalog is giving users hands-on access to the data. When we allow users to see data in the database, we submit the query to your data store at the time the user requests the data.

What else can I do to keep my data secure?

We suggest taking the following precautions when connecting directly to your data stores through Django Lineage:

  • Always create dedicated users with permissions scoped to the databases, schemas and access levels (read only) that are required

  • Use SSL and certs for connections where possible

  • Update passwords every 30 days
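
For the SSL recommendation above, the following Python sketch shows one way to build a client-side TLS context that verifies the server certificate. The function name and the `ca_file` parameter are illustrative assumptions; how the context (or equivalent settings) is handed to a connection depends on the database driver in use.

```python
import ssl


def build_tls_context(ca_file=None):
    """Build a client-side TLS context that verifies the server certificate."""
    # create_default_context() enables certificate and hostname verification.
    context = ssl.create_default_context(cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return context
```

When `ca_file` is omitted, the system's default trusted certificates are used; pass the path to a CA bundle if your data store uses a private certificate authority.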