Secure Data Flow
While data exchange on the public internet is protected by robust protocols like TLS. Most companies handle sensitive user data internally implement much loose data security and encryption standards. It's understandable. First, it's very complicated. Second, without shared process and standards, it's very costly to implement and maintain. We build Keyper with the goal to simplify and streamline data security and encryption implementations for product and data engineers.
Direct AES + ACLs Encryption Flow¶
This is the simplest approach with the least overhead introduced to both the client and backend.
Instead of using MAC, this flow implements identity and access management using existing cloud IAM solutions such as AWS IAM, GCP IAM etc. as opposed to data integrity checks. IAM is much more common and widely adopted in most companies' internal data management and security practices and introduce minimal change to existing implementations.
sequenceDiagram
autonumber
participant Client
participant Backend
participant KMS
participant Data Store
Client->>Backend: Send raw data
Backend->>KMS: Create AES key
Backend->>Backend: Encrypt raw data with AES key
Backend->>Backend: Add ACLs for data
Backend->>Data Store: Store encrypted data
Data Store->>Backend: Retrieve encrypted data
Backend->>Backend: Verify ACLs
Backend->>Backend: Decrypt data with AES key
Backend->>Client: Return raw data
Pros and Cons¶
Pros
- Secure AES Key Management: The AES key is managed by the backend applications. Backend applications are generally more secure than the client as they typically have multiple layers of security (network, application, cloud infrastructure, etc.) that can more effectively prevent AES key leakage.
- Minimal Overhead: This approach introduces the least overhead for both the client and backend, making it efficient and straightforward to implement.
Cons
- Access to Raw Data: The backend application has access to the actual AES key and raw data, which means that for highly sensitive data, the backend itself could be a potential point of compromise.
- Trust Issues: Since the backend can see the raw data, it cannot be fully trusted in scenarios where data confidentiality is important.
- Misconfiguration Risks: Both the AES key management and IAM/ACLs policies could be misconfigured or mishandled, leading to potential security vulnerabilities.
Use Cases¶
Because of the simplicity and low overhead nature of this flow, it's ideal for internal data management to provide extra protection on top of existing IAM and permission policies for added security in case there are misconfigurations or management issues in other layers. For example, if a retention policy is set to infinite but encryption keys are rotated, the data will technically no longer be accessible. Another common example is that even though legally logs are required to be retained for 7-10 years, only certain personnel have permission to decrypt data with the key for legal reasons.
- Log Storage: Encrypt sensitive application logs.
- Client Data Storage: Encrypt client data for client data collected by application.
- Data Processing Pipeline: Encrypt data in cloud based, distributed data processing pipelines.
Hybrid RSA + AES Encryption Flow¶
This approach adds overhead to the client compared to the Direct AES + ACLs Encryption Flow, as the client must encrypt the data with the RSA public key. This provides added security by ensuring the client doesn't send raw data to the backend. Additional implementations such as MAC can be added to ensure data integrity between the client and backend.
It adds overhead to the client comparing to direct AES + ACLs encryption flow as it has to encrypt the data with RSA public key to provide added security where client doesn't send raw data to backend. It's possible to add additional implementations such as MAC to ensure between client and backend, data is not tempered.
sequenceDiagram
autonumber
participant Client
participant Backend
participant KMS
participant Data Store
Backend->>KMS: Create RSA key
Client->>KMS: Retrieve RSA public key
Client->>Client: Encrypt data with RSA public key
Client->>Backend: Send encrypted data
Backend->>Backend: Decrypt data with RSA private key
Backend->>KMS: Create AES key
Backend->>Backend: Encrypt raw data with AES key
Backend->>Data Store: Store encrypted data
Data Store->>Backend: Retrieve encrypted data
Backend->>Backend: Decrypt data with AES key
Backend->>Client: Return decrypted data*
*Repeat the same process to encrypt/decrypt with RSA key between backend and client
Pros and Cons¶
Pros
- Enhanced Security: By using RSA for the initial encryption, the data is protected during transmission. The use of AES for storage ensures efficient encryption.
- Separate Key Management: RSA and AES keys are managed separately, adding an additional layer of security to reduce the risk of data leakage.
- Scalability: Suitable for systems that require secure data transmission and storage, allowing for scalable key management practices.
- Compliance: Meets regulatory requirements for the secure handling of sensitive data.
Cons
- Backend Access to Data: The backend can see the raw data once it is decrypted with the RSA private key, which might not be ideal for highly sensitive data.
- Added Complexity and Overhead: Managing both RSA and AES keys adds more complexity and performance overhead compared to Direct AES + ACLs Encryption Flow.
Use Cases¶
- Sensitive Data Transmission: Securely transmit sensitive data, such as credit card numbers or social security numbers, from the client to the backend.
- Secure Form Submission: Ensure secure transmission of sensitive form data (e.g., online application forms containing personal information) from the client to the backend.
AES + RSA + TEE Encryption Flow¶
This approach combines Hybrid RSA + AES Encryption Flow with Trusted Execution Environment (TEE) to enhance data security. The client encrypts the data with the RSA public key before sending it to the backend, which then sends it to the TEE. Within the TEE, the data is decrypted using the RSA private key, re-encrypted with an AES key, and sent to the backend to store in the data store. When the data is needed, the process is reversed, ensuring that sensitive data remains protected throughout the entire workflow.
This flow adds a lot of overhead compared to the Direct AES + ACLs Encryption Flow and Hybrid RSA + AES Encryption Flow as it needs to spin up an independent TEE to manage the encryption and decryption. However, the advantage of this approach is that the backend never has access to the data or the AES key. Data is tightly protected throughout the entire data transmission flow.
sequenceDiagram
autonumber
participant Client
participant Backend
participant KMS
participant TEE
participant Data Store
Backend->>KMS: Create RSA and AES keys
Client->>KMS: Retrieve RSA public key
Client->>Client: Encrypt data with RSA public key
Client->>Backend: Send encrypted data
Backend->>TEE: Send encrypted data and client key references
TEE->>KMS: Retrieve RSA private and AES keys
TEE->>TEE: Decrypt data with RSA private key
TEE->>TEE: Encrypt raw data with AES key
TEE->>Backend: Send encrypted data
Backend->>Data Store: Store encrypted data
Data Store->>Backend: Retrieve encrypted data
Backend->>TEE: Send encrypted data
TEE->>KMS: Retrieve RSA public and AES keys
TEE->>TEE: Decrypt data with AES key
TEE->>TEE: Encrypt data with RSA public key
TEE->>Backend: Send encrypted data
Backend->>Client: Return encrypted data
Client->>Client: Decrypt data with RSA private key
Pros and Cons¶
Pros
- Enhanced Security: Combining RSA and AES with TEE provides a highly secure environment for handling sensitive data.
- Data Confidentiality: The TEE ensures that the decrypted data and AES key are only accessible within the secure enclave, protecting against unauthorized access. Raw data and AES key is never exposed outside TEE and client.
- Integrity and Trust: The use of TEE guarantees that data processing is performed in a trusted environment, ensuring data integrity and trustworthiness.
- Regulatory Compliance: Meets the most strict regulatory requirements for the secure handling of sensitive data.
Cons
- Complexity and Cost: Implementing and managing TEE, along with RSA and AES encryption, introduces significant complexity and cost.
- Performance Overhead: The additional security measures, including TEE, introduce significant performance overhead compared to simpler encryption methods.
Use Cases¶
- Financial Data Protection: Securely handle and store sensitive financial information, such as credit card details and banking information, ensuring data confidentiality and integrity.
- Healthcare Records Management: Protect sensitive healthcare records, ensuring they are processed and stored in a highly secure and trusted environment.
- Regulated Industries: Suitable for industries with strict regulatory requirements for data protection, such as government and defense, ensuring compliance and data security.
Comparison¶
Feature | Direct AES + ACLs Encryption Flow | Hybrid RSA + AES Encryption Flow | AES + RSA + TEE Encryption Flow |
---|---|---|---|
Encryption Method | AES for encryption, IAM for access control | RSA for initial encryption, AES for storage | RSA for initial encryption, AES for storage, TEE for processing |
Data Transmission | Raw data sent to backend | Encrypted with RSA, then AES | Encrypted with RSA, processed in TEE, then AES |
Backend Access to Data | Backend can access raw data | Backend can access data after RSA decryption | Backend never accesses raw data or AES key |
Security Level | Basic encryption and access control | Enhanced security with separate key management | Highest security with TEE, ensuring data confidentiality |
Performance Overhead | Minimal | Moderate due to dual encryption | High due to TEE and dual encryption |
Complexity | Low | Moderate | High due to TEE implementation |
Use Cases | Internal data management, log storage, client data storage | Sensitive data transmission, secure form submission | Financial data protection, healthcare records, regulated industries |