Building AI Intuition

Connecting the dots...


[PET 1] Privacy Enhancing Technologies – Introduction

By Archit Sharma
Updated on March 3, 2026

Every time you browse a website, click an ad, make a purchase, or train an ML model, data flows through systems. Companies need this data — for analytics, measurement, personalization, and product improvement. But they also have legal, ethical, and business obligations to protect privacy.

This creates a fundamental tension:

How do we extract value from data while minimizing who can see what, when, and for what purpose?

Privacy Enhancing Technologies (PETs) are the technical toolkit that resolves this tension. They’re neither a magic bullet nor a single solution — they’re a layered system of techniques, each addressing a different phase of the data lifecycle.


The Data Lifecycle: Where Privacy Attacks Happen

Data moves through distinct phases, and each phase has different privacy risks. No single technology solves all phases.

| Phase | Risk | PETs That Help |
| --- | --- | --- |
| At collection | Collecting more than needed | Data Minimization, Purpose Limitation |
| At rest | Identifiers exposed in storage | Storage Anonymization, Encryption (AES) |
| In transit | Data intercepted on network | TLS, Diffie-Hellman |
| In use / compute | Data exposed during processing | TEE / CVM (trust the chip maker), MPC (trust the math), Data Clean Rooms |
| At output | Query results leak individuals | Query Anonymization, Differential Privacy |
| In measurement | Attribution reveals behavior | Sales Lift (in DCR), Entropy Balancing |
| In ML | Models memorize training data | PATE |
| Across partners | Identity linked across systems | Crosswalks, Private Set Intersection (PSI) for privacy-safe mapping |

The art of privacy engineering is layering the right techniques for each phase of your data’s journey.

* One interesting twist: in some cases, differential privacy can also be applied at the source (the input side). Apple and Google insert noise on-device, especially when collecting telemetry.
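To make the on-device idea concrete, here is a minimal local-DP sketch using randomized response, a classic mechanism (not Apple's or Google's actual implementation). Each user perturbs their own answer before it ever leaves the device, and the aggregator inverts the noise to recover the population rate; the `p_truth` value is an illustrative choice, not a recommended parameter.

```python
import random

def randomized_response(true_bit: int, p_truth: float = 0.75) -> int:
    """Report the true bit with probability p_truth, else a fair coin flip."""
    if random.random() < p_truth:
        return true_bit
    return random.randint(0, 1)

def estimate_rate(reports: list[int], p_truth: float = 0.75) -> float:
    """Invert the noise: E[report] = p_truth * rate + (1 - p_truth) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth

# 100k simulated users, 30% of whom have the sensitive attribute; no single
# report is trustworthy, but the aggregate rate is still recoverable.
random.seed(0)
truth = [1 if random.random() < 0.30 else 0 for _ in range(100_000)]
reports = [randomized_response(b) for b in truth]
print(round(estimate_rate(reports), 2))  # close to 0.30
```

Any individual report is deniable (it might be a coin flip), yet the aggregate estimate lands near the true 30% rate.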


The Three Layers of Privacy Protection

This guide is organized into three parts, each covering a distinct layer.

Part 1: Data Protection Fundamentals

What happens to your data inside a single organization.

These are the foundational techniques that every data system should implement:

| Technique | What It Does | Phase |
| --- | --- | --- |
| Data Minimization | Collect only what you need, delete when done | Collection |
| Storage Anonymization | Replace identifiers with pseudonyms | At rest |
| Query Anonymization | Enforce cohort thresholds (a minimum number of users aggregated per row) on outputs | Output |
| Differential Privacy | Add noise with mathematical privacy guarantees | Output |
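A toy sketch of how the two output-phase techniques combine, assuming hypothetical values for the cohort threshold (`K_MIN`) and privacy budget (`EPSILON`): counts below the threshold are suppressed entirely, and surviving counts receive Laplace noise (sampled here as the difference of two exponential draws).

```python
import random

K_MIN = 50      # hypothetical cohort threshold
EPSILON = 1.0   # hypothetical privacy budget; one user changes a count by at most 1

def laplace_noise(scale: float) -> float:
    """Laplace(0, scale) as the difference of two exponential draws."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_counts(counts: dict[str, int]) -> dict[str, float]:
    """Suppress small cohorts, then add Laplace(1/epsilon) noise to the rest."""
    out = {}
    for cohort, count in counts.items():
        if count < K_MIN:
            continue  # query anonymization: row below the cohort threshold
        out[cohort] = count + laplace_noise(1.0 / EPSILON)  # differential privacy
    return out

random.seed(1)
result = private_counts({"age_25_34": 1200, "age_35_44": 830, "rare_cohort": 7})
print(sorted(result))  # ['age_25_34', 'age_35_44'] -- rare_cohort suppressed
```

Note the layering: the threshold removes rows that are individually identifying, while the noise protects anyone hiding inside the rows that do get released.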

Part 1 answers: “How do I protect data within my own systems?”

Read Part 1: Data Protection Fundamentals →


Part 2: Secure Collaboration and Infrastructure

How multiple organizations work together without sharing raw data.

Modern business requires collaboration — advertisers measuring campaigns with retailers, healthcare providers conducting joint research. These techniques enable collaboration while preserving privacy:

| Technique | What It Does | Phase |
| --- | --- | --- |
| Identity Mapping / Crosswalks / Private Set Intersection | Connect users across systems without sharing raw IDs | Across partners |
| Data Clean Rooms | Compute joint insights in a governed environment | In use / compute |
| Purpose Limitation | Bind data access to declared intent | Collection + Use |
| TEE / CVM | Hardware-isolated computation; even admins can't see inside | In use / compute |
| Diffie-Hellman + AES | Secure key exchange and encryption in transit | In transit |
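To illustrate the identity-mapping row, here is a deliberately simplified sketch of hashed-ID matching. Real PSI protocols are oblivious (e.g. Diffie-Hellman-based blinding) and need no shared key; the `SHARED_KEY` here is a hypothetical stand-in that at least prevents brute-forcing of low-entropy IDs like emails, which a plain unkeyed hash would not.

```python
import hashlib
import hmac

# Hypothetical pre-agreed key; a real PSI deployment would use an oblivious
# protocol instead, so that no such key ever has to exist.
SHARED_KEY = b"agreed-out-of-band"

def blind(user_id: str) -> str:
    """Keyed hash so blinded IDs can't be brute-forced without the key."""
    return hmac.new(SHARED_KEY, user_id.encode(), hashlib.sha256).hexdigest()

advertiser_ids = {"alice@x.com", "bob@x.com", "carol@x.com"}
retailer_ids = {"bob@x.com", "carol@x.com", "dave@x.com"}

# Each side shares only blinded IDs; the overlap is computed on those alone.
overlap = {blind(u) for u in advertiser_ids} & {blind(u) for u in retailer_ids}
print(len(overlap))  # 2 -- bob and carol match; no raw IDs were exchanged
```

The output is the size (and blinded membership) of the intersection, which is exactly what a measurement crosswalk needs and nothing more.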

Part 2 answers: “How do I collaborate with partners without exposing raw data?”

Read Part 2: Secure Collaboration and Infrastructure →


Part 3: Privacy-Preserving Computation and Measurement

How to compute, train ML, and measure without revealing inputs.

The most advanced layer: performing computation across parties where no single party sees the combined data, training ML models on sensitive data, and measuring business outcomes within privacy constraints:

| Technique | What It Does | Phase |
| --- | --- | --- |
| Multi-Party Computation (MPC) | Joint computation without revealing inputs | In use / compute |
| PATE | Train ML models with differential privacy | In ML |
| Sales Lift / Incrementality | Measure causal ad impact (operates within PET infrastructure) | In measurement |
| Entropy Balancing | Correct imperfect experiments (privacy-compatible) | In measurement |

Part 3 answers: “How do I compute, train, and measure when even the computation itself must be private?”

Read Part 3: Privacy-Preserving Computation and Measurement →


How the Pieces Fit Together

A real-world privacy-preserving system might use all of these together. Consider an advertiser measuring campaign effectiveness with a retailer:

  1. Data Minimization: Retailer collects only purchase amount + timestamp (not full basket). Advertiser collects only ad exposure.
  2. Storage Anonymization: Both parties pseudonymize user IDs before any analysis.
  3. Identity Mapping: Establish crosswalk via identity provider — hashed mappings only, or use Private Set Intersection with double hashing.
  4. Data Clean Room: Both parties upload pseudonymized data to clean room. Clean room may run inside Confidential VM (TEE protection) for stronger privacy.
  5. Purpose Limitation: Query declares purpose: “ads measurement”. System verifies both parties’ data allows this purpose.
  6. Query Execution with Privacy: The approved aggregate query runs inside a TEE. Differential-privacy noise is added to the results, and query anonymization enforces cohort thresholds, suppressing any aggregated row that contains fewer users than the threshold.
  7. Measurement: Sales lift computed (exposed vs. control). Entropy balancing corrects for any group imbalances.
  8. Output: Only noisy aggregate result leaves clean room: “Campaign drove +12% incremental sales lift.” Neither party saw the other’s raw data.

Each layer catches what the previous layer missed. Together, they enable insights that would be impossible — or irresponsible — without privacy protection.


Quick Reference: Choosing the Right Technique

| If you need to… | Use… | Covered in… |
| --- | --- | --- |
| Reduce data collection footprint | Data Minimization | Part 1 |
| Protect identifiers in storage | Storage Anonymization | Part 1 |
| Prevent queries from exposing individuals | Query Anonymization | Part 1 |
| Add mathematical privacy guarantees | Differential Privacy | Part 1 |
| Connect users across partner systems | Crosswalks / PSI for ID mapping | Part 2 |
| Compute joint insights with partners | Data Clean Rooms | Part 2 |
| Bind data to declared purpose | Purpose Limitation | Part 2 |
| Protect data even from cloud admins | TEE / CVM | Part 2 |
| Compute without any party seeing inputs | MPC | Part 3 |
| Train ML on sensitive data privately | PATE | Part 3 |
| Measure causal ad impact | Sales Lift (in DCR) | Part 3 |
| Correct imperfect experiment groups | Entropy Balancing | Part 3 |

The Core Principle

Every technique in this guide exists to answer one question:

How do we extract value from data while minimizing who can see what, when, and for what purpose?

The answer is never a single technology. It’s a layered defense:

  • Minimize what you collect
  • Anonymize what you store
  • Protect what you compute
  • Add noise to what you output
  • Audit what you access

No layer is perfect. Each has trade-offs — flexibility, accuracy, speed, cost. The art is choosing the right combination for your use case, understanding what each layer protects, and being honest about residual risks.

Privacy engineering isn’t about perfection. It’s about thoughtful layering — building systems that extract value while genuinely protecting the individuals behind the data.


Start Reading

Part 1: Data Protection Fundamentals →

Data Minimization, Storage Anonymization, Query Anonymization, Differential Privacy

Part 2: Secure Collaboration and Infrastructure →

Identity Mapping, Data Clean Rooms, Purpose Limitation, TEE/CVM, Encryption

Part 3: Privacy-Preserving Computation and Measurement →

Multi-Party Computation, PATE, Sales Lift, Entropy Balancing
